Preface 


Causation, the relation of cause to effect, has long been recognized as one of the 
most central subjects in philosophy. After a period of relative neglect during 
the era of logical positivism, the late twentieth century has seen a renaissance 
of interest in causation, as one philosopher after another provides a “causal the- 
ory” of this or that phenomenon: reference and meaning, identity and duration, 
perception and knowledge, information and representation. At the same time, 
the development of the formal disciplines, including modal logic (the logic of 
possibility and necessity), probability theory, mereology (the theory of parts 
and wholes), defeasible or “nonmonotonic” logics (developed in the field of ar- 
tificial intelligence to represent commonsense inference), and partial semantics 
(most prominently, the situation theory of Barwise, Perry, and Etchemendy), 
has provided the tools needed for an exact and comprehensive theory of causa- 
tion. 

Up to this point, formal accounts of causation have followed the empiricist 
strictures laid down by David Hume. These accounts of causation force the con- 
cept into the periphery (making the concept of causation dependent on our prior 
understanding of such theoretical machinery as spatiotemporal location, sub- 
junctive conditionals, experience, and empirical knowledge) and consequently 
do not mesh with the causal theories that have become so popular in epistemol- 
ogy and the philosophy of mind, which, by contrast, require causation to play a 
central and non-derivative role. In this book, I construct a non-Humean or real- 
ist theory of causation (employing the technical tools mentioned in the preceding 
paragraph), and I show how this account sheds light on existing causal theories 
and their outstanding problems. In the process, I sketch a metaphysical theory 
that employs relatively few primitive elements and comprises a well-understood 
mathematical theory of these elements and a precise account, in terms of these 
elements, of a wide variety of phenomena, drawn both from our common experi- 
ence and scientific knowledge. These phenomena include information, teleology 
and biological function, mental representation, qualia and mental causation, our 
knowledge of logic, mathematics, and theoretical science, the structure of space 
and time, the identity and duration of physical objects, and the nature and 
objectivity of ethical values. 

I offer what could be called a “naturalistic” account of the normative di- 
mension: the standards of correctness and propriety that are essential to our 
understanding both of intentionality and of ethics. It builds upon and refines 
recent work on the teleological theory of norms on the part of Dretske, Stampe, 
Millikan, and others. At the same time, the argument of the book is in part 
directed against a narrowly materialistic ontology. I provide seven indepen- 
dent lines of argument for thinking that we need to recognize the existence of 
states other than merely physical states; in particular, we must acknowledge the 
existence of modal facts, including facts of logical, mathematical, and natural 
necessity. By bringing these modal facts within the scope of causation, I explain 
how it is possible for us to gain information about them. Consequently, I am 
able to defend a position that is realist in the sense both of including a version 
of the traditional correspondence theory of truth and of including an ontology 
in which mental states, qualia, numbers and sets, objective norms, and modal 
facts are first-class citizens. 

Acknowledgment is made to the following publishers for their kind permis- 
sion to reprint excerpts from: 


“Teleology as Higher-Order Causation: A Situation-Theoretic Account,” Minds 
and Machines 8 (1998): 559-585. Published by Kluwer Academic Publishers; 
reprinted on pages 82-90, 95-96, 115-116, 135-143, and 203-215. 
“Situation-Mereology and the Logic of Causation,” Topoi 18 (1999). Published 
by Kluwer Academic Publishers; reprinted in chapter 3, pages 35-49. 

“A New Look at the Cosmological Argument,” by Robert C. Koons, American 
Philosophical Quarterly 34 (April 1997), pages 194-199 and 202-207; reprinted 
in chapter 9, pages 146-159. 

“Information, Representation and the Problem of Error,” by Robert C. Koons, 
in Logic, Language, and Computation, edited by J. Seligman and D. Westerstahl, 
published by the Center for the Study of Language and Information, Stanford, 
California, 1996, pages 333-345; reprinted in chapter 11, pages 181-184. 


Work on this book was made possible by a Faculty Research Assignment 
from the University Research Institute at the University of Texas at Austin, 
during the spring semester of 1997. I would also like to thank Michael Dunn, 
the Philosophy Department and the Institute for Advanced Study at Indiana 
semester. I would also like to thank Anil Gupta and Gregg Rosenberg, who 
provided very helpful feedback on early drafts of the book. 

Jon Barwise provided the inspiration for the formal framework, situation 
theory, used in this book, and Jon was extraordinarily generous in giving me 
both his time and his encouragement at the inception of the project. Professor 
Barwise was one of the most creative and original philosophers of our time. He 
will be sorely missed. 

My debt to my teachers, including David Charles at Oriel; Robert M. Adams, 
visor, Tyler Burge, is incalculable. 


Introduction 


Physicists are currently searching for what they call a “theory of everything.” 
However, it turns out that the “everything” they have in mind falls far short 
of every thing. The physicists’ theory of everything has nothing to say about 
mental phenomena, agency, values, norms, teleology, or intentionality, to men- 
tion but a few. In fact, physicists rarely have much to say about the natures of 
the fundamental elements of their theory: particles, fields, space, and time. 

None of this is surprising, and none of it is a criticism of current physics 
as such. When physicists refer to a coming “theory of everything,” they do so 
(or, at least, the sophisticated ones do so) with tongue in cheek. It is meta- 
physics, and not physics, whose province it is to fashion a theory of everything. 
This book is a work in real, honest-to-God, no-apologies-given metaphysics, but 
metaphysics conducted in a thoroughly scientific spirit. My hope is that it will 
help to stimulate a return to the perennial concerns of philosophy. 


1.1 A Comprehensive Realism 


A class of propositions can be interpreted realistically when two conditions are 
met: 


1. Some of the propositions are evaluated as true or false. 


2. The truth or falsity of the propositions in the class is determined by some 
set of facts, and this set of facts plays an indispensable role in explaining 
our knowledge of the truth or falsity of the propositions in the class. 


The first condition is not sufficient, since the truth values of the propositions 
could be determined by facts about our collective acts of affirmation or projec- 
tion, in which case the propositions could not be interpreted realistically. The 
causal element introduced by the second condition is critical, because it specifies 
a direction of asymmetric dependence: our knowledge depends causally on the 
fact establishing the truth or falsity of the corresponding propositions. This en- 
tails that the facts determining these truth conditions do not include facts about 
our attitude toward those very propositions, since causal dependency cannot be 
circular. 

I will argue that propositions involving reference to the following things can 
and should be interpreted realistically: 


• Natural properties and relations 

• Situation and event tokens 

• Modality and objective probability 

• Causal connections 

• Numbers 

• Proper functions (teleofunctions) 

• Mental states 

• Secondary qualities 

• Enduring substances 

• Values and norms 


Although my position is one of a comprehensive realism, I give a relatively 
simple and unified picture of the world. The first three items on the list above 
are treated as primitives, but all of the others are explicated in terms of these 
more fundamental entities, properties, and relations. Everything that is posited 
to exist is posited to exist because of some role it plays in the causal network of 
the world. My approach is resolutely non-dualistic: I reject any sort of Cartesian 
or neo-Cartesian postulation of a scientifically inaccessible realm of subjectivity. 

At the same time, I do not start with any a priori or dogmatic requirement. 
My aim has not been to build a theory of the mind that is materialistic or physi- 
calistic or naturalistic. To begin one’s metaphysical inquiry with such dogmatic 
commitments is methodologically irresponsible. We must simply follow the ev- 
idence where it leads. If it leads to materialism, well and good, but if it leads 
away from it (as my own account does in several respects), we must be willing 
to be accountable to the facts, not to philosophical fashion. 

Theories of content, meaning, and representation in terms of causal connec- 
tion have become very prevalent. A number of philosophers have taken causal 
theories of content as reason to be anti-realist about values (Mackie, Harman), 
numbers (Field), and minds (the Churchlands). In my view, the burden of such 
anti-realism is too great for a theory of content to bear. However, if a causal 
theory of content could be devised that vindicated realism about values, num- 
bers, and minds, such a theory would give us the best of both: a plausible, 
informative, and simple account of content, and the accommodation of much of 
our commonsense view of the world. In this book, I will try to develop such a 
theory. 


1.2 Metaphysical Method 


This book is unapologetically a work of substantive metaphysical theory. For- 
tunately, blind anti-metaphysical prejudice is not as common as it once was. 
Nonetheless, many may legitimately ask for the ground rules of the enterprise. 
In a recent book on causation, Daniel Hausman (1998) proposed five criteria for 
evaluating metaphysical theories: 


1. Intuitive fit 


2. Empirical adequacy, consistency with what we know about the world, 
including our best scientific knowledge 


3. Epistemic access: the theory should include some account of how we 
could come to know its truth 


4. Superseding competitors: the theory should incorporate the successes 
of its predecessors 


5. Metaphysical fecundity: the theory should shed light on a variety of 
metaphysical issues 


The only criterion that I would add to the list is that of simplicity or elegance. 
A good metaphysical theory should not be in need of ad hoc rescues or endless 
epicyclic tinkering. 

The principal motivation of my work is that of unification. I aim to provide 
a unified account of intentionality and knowledge, one in which we give exactly 
the same kind of account both for our thought about and knowledge of objects 
and events in space and time, and for our thought about and knowledge of the 
facts of logic, mathematics, laws of nature, and objective chance. We should not 
accept a bifurcated, disjunctive account of thought and of knowledge so long as 
a unified account is possible. The theoretical cost of postulating genuine modal 
facts (as I do) is small in comparison to the benefits of unification. 


1.3 An Alternative to Both Physicalism 
and Mysterianism 


Since causal relations play the fundamental role in my metaphysics, the term 
“causalism” might be an appropriate term for my approach. In recent years, 
others have taken what could be described as an essentially causalist approach 
to the metaphysics of mind, namely Armstrong, Millikan, Dretske, Papineau, 
and Lycan. A causalist theory of mind identifies intentionality with a certain 
kind of causal property (perhaps involving higher-order causal connections), 
and the peculiar qualities of conscious experience are taken to be explicable in 
terms of their intentionality. In all of these cases, causalism is seen as a strategy 
for defending materialism against various objections concerning intentionality 
and consciousness. The opponents of these approaches, including Searle and 
McGinn, have been labeled the “mysterians,” since they hold that we can expect 
to find no informative account of the nature of intrinsic intentionality or 
consciousness. 

Unfortunately, those participating in these controversies have overlooked the 
fact that causalism is separable from a commitment to physicalism or materi- 
alism. A non-physicalist causalism would include an informative account of 
the nature of mental states without insisting that everything can ultimately be 
explained in terms of atoms and the void. I will argue that all of the extant 
objections to causalist theories of mind are in reality objections to the conjunc- 
tion of causalism with physicalism. A non-physicalist causalism provides the 
resources for an adequate answer to these objections. In addition, I will argue 
that there are independent grounds, having nothing to do with the philosophy 
of mind, for rejecting physicalism. 


1.4 Causal Internalism 


The notion of causality is absolutely central to recent philosophical work in 
semantics, the philosophy of mind and intentionality, epistemology, and philos- 
ophy of science. Work by Donnellan, Kripke (1980), and Putnam (1975) helped 
to make causal connections an indispensable part of our accounts of reference 
and signification. This in turn has generated causal theories of information and 
content by Dretske (1981), Fodor (1990), and others. The Gettier problem led to 
the renaissance of causal theories of knowledge by Goldman (1979), Armstrong 
(1968), Pollock (1986), and Plantinga (1993). Causality is put to much work 
in recent theories of personal identity and of the nature of mental states (as in 
the functionalism of Lewis (1986b) and Putnam (1975)). Causation continues 
to figure prominently in philosophy of science — e.g., Wesley Salmon’s causal 
theory of evidence (Salmon (1984)) — and in theoretical science, both within 
physics and outside. 

Additionally, causal reasoning plays a central role in both understanding and 
predicting events. Recent work in artificial intelligence has brought causal rea- 
soning into renewed prominence. For example, the much-discussed Yale Shoot- 
ing Problem reveals (according to most diagnoses; see especially Pearl (1988)) 
the absolute necessity of recording and using information about the causal links 
between the bits of information we have about the world. 

Attempts to explain away causation or to replace it with some purely sta- 
tistical regularity (whether or not supplemented by some kind of psychologistic 
decoration) have proved to be catastrophic failures. Every attempt to explain 
causal direction (surely one of the most fundamental features of causality) in 
terms of the nomological-deductive model has failed. Such models of causality 
have generated paradoxes far more rapidly than ad hoc solutions can be invented 
for them. 

If a robust sense of reality leads us to recognize causal connections as first- 
class citizens of our ontological inventory, we must also make room for those 
special kinds of objects that can serve as relata for causal relations, whether 
we call these objects possible ‘facts’, ‘situations’, or ‘states of affairs’. These 
objects must be distinguished from propositions and from quasi-linguistic rep- 
resentations if we are to capture accurately the logical relations governing causal 
idioms. The restoration of such fact-like entities to respectability has also been a 
common theme of recent work in philosophy, including philosophical linguistics 
and the Stanford situation theory of Barwise and Perry (1983). 

The project of building a unified theory of intentionality and knowledge 
in causal (or teleo-causal) terms faces a major obstacle: accounting for our 
knowledge of modal facts, i.e., facts about necessity and possibility (including 
logical and mathematical modality), about counterfactual conditionals, about 
objective chance or propensity (as a generalization of objective modality), and 
about physical or natural necessity as embodied in natural laws. This obstacle is 
a generalization of the problem Paul Benacerraf (Benacerraf (1983a), Benacerraf 
(1983b)) has raised in the case of mathematics: how is definite reference to 
and substantive knowledge about mathematical objects possible, given that our 
best theories of reference and knowledge involve causal connections between our 
thoughts and their targeted aspects of reality? Benacerraf’s problem generalizes 
to our thought about the laws of nature, about the objective chances of certain 
kinds of events in certain situations, and about various kinds of possibility and 
necessity. In each case, we seem to have intentional reference to and knowledge 
of things that the philosophical tradition has long considered to be causally 
inert. 

Overcoming this obstacle calls for a revolutionary rethinking of our standard 
picture of causation. This standard picture I call the horizontal or externalist 
model of causation. The alternative I am proposing is the thesis of causal 
internalism, which countenances the reality of vertical causation. 

On the standard, horizontal model, causes and effects are, exclusively, phys- 
ical, spatiotemporally local states and occurrences. The causal nexus, whether 
it consists in a kind of necessary, stochastic, or nomic connection, stands out- 
side of both the cause and the effect. This is why I call it causal externalism: 
the causal nexus is wholly external to both the cause and the effect. The hori- 
zontal/externalist model can account for our knowledge of occurrent properties 
realized in spatiotemporal locations, but it leaves the entire realm of modality 
causally, and, therefore, cognitively and epistemically, inaccessible. 

My alternative proposal is that we consider the modal (or nomic or stochas- 
tic) facts that tie the cause to the effect to be internal to the cause or to the 
effect. Depending on the details of one’s account of causation, causes necessitate 
or probabilify or possibilify their effects. On an internalist model, the fact that 
a given cause necessitates its effect is itself an integral part of the total cause, 
not something that stands outside or above the cause-effect pair. Consequently, 
modal facts are every bit as causally efficacious as are occurrent physical facts, 
and so there is no barrier to providing a unified, causal theory of all of human 
thought and knowledge. For instance, we can think about and gain knowledge 
of natural laws by virtue of the fact that each of these laws enters into some, 
but not all, causal connections. When we observe a regularity (like the ellipti- 
cal orbits of the planets) that is really caused by a particular nomic fact (like 
the law of gravitation), then our observations provide us with intentional and 
epistemic contact with that nomic fact. 

Here are some of the more significant claims that I make in part I concerning 
the nature of causation: 


1. The causal nexus is not something above and outside the cause and effect 
but consists of facts wholly internal to the cause and the effect. This thesis 
of causal internalism commits me to the existence of vertical causation 
from modal and nomic facts to ordinary spatiotemporal ones, crucial to 
giving a unified, causal account of intentionality and knowledge. 


2. Modal facts exist, including facts of logical and mathematical necessity, 
and these facts are not reducible to or supervenient on the occurrent facts 
of the world (including its merely actual regularities). The existence of 
logical types (negations, conjunctions, disjunctions, etc.) of arbitrary 
complexity is a substantive fact about the world. 


3. There are compelling reasons for rejecting a strong version of determinism, 
reasons that are independent of the problem of free will (chapters 4 and 5). 


4. Only actual situations exist, but in constructing models for modal logic, 
it is convenient to introduce the fiction of merely possible and even 
impossible situations. 


5. I propose a new solution to the problem of the scope or extent of causation, 
namely, that every wholly contingent state has a cause. On the basis of this 
principle, I demonstrate the existence of a necessary first cause (chapter 8). 


6. It is possible to give a principled basis for a defeasible or nonmonotonic 
logic that incorporates causal information. This logical calculus (developed 
in appendix B) generates rich and plausible conclusions about probable 
consequences of known or hypothesized states. 


My theory of causation is designed to provide an exact, mathematical model 
that satisfies the following aims: 


1. Causal connections and order should be defined without reference to space 
and time, permitting the construction of a non-circular, causal theory of 
spacetime. 


2. It should permit the possibility of higher-order or vertical causal 
connections, in order to explain logical and mathematical knowledge, mind/body 
interaction, and the nature of teleofunctions. 


3. It should provide natural explanations of the formal properties of causation 
and causal explanation, including transitivity, asymmetry, and veridicality. 


4. It should match the data provided by intuitions about the validity and 
invalidity of various forms of causal reasoning. In particular, it should 
explain the failure of substitution of classical equivalents in causal contexts 
(see chapter 3), and our default assumption of the universality of causal 
explanation (chapter 8). 


5. It should be able to navigate successfully through the complexities of the 
relationship between causality on the one hand, and modal and statisti- 
cal relations on the other. It should not treat causation as a primitive, 
with no intrinsic relationship to correlation or necessity, but it must avoid 
the paradoxes that have resulted from attempts to reduce causality to 
statistical relations. 


6. It should be compatible with indeterminism, and with merely probabilistic 
connections between cause and effect (chapters 5 and 6). 


7. It should provide an account of the modularity (or locality) of causal 
reasoning: the role (recently much investigated by researchers in the field 
of artificial intelligence) of causation in enabling us to draw correct default 
conclusions in the presence of irrelevant information (appendix B). 


The last desideratum is especially important, since any theory of causation 
that does not account for the special virtues of causal reasoning is seriously 
incomplete. Researchers in logic and artificial intelligence, such as Judea Pearl 
(1988), have discovered that reference to causal relations plays an indispensable 
role in our commonsense reasoning about the world. The Yale Shooting 
Problem of Hanks and McDermott (which I discuss in appendix B) is an excellent 
example of the sort of problem of reasoning about prospective change 
that requires a causally informed description of the situation. I argue that the 
fundamental characteristic of causality that explains its importance in common- 
sense reasoning is the Markov property: when one fact is causally screened off 
from a second by one of its causes, then the conditional probability of the sec- 
ond on the cause is independent of the first fact. This justifies our exclusion of 
causally irrelevant information (information that is causally screened off from 
our prospective conclusions by our premises) in reasoning defeasibly. 
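
The screening-off condition can be checked numerically on a small example. The following is only a sketch under stated assumptions: a toy chain A -> B -> C of binary facts, with made-up probabilities and hypothetical helper names (`joint`, `cond_prob`) that do not come from the text. It shows that A is relevant to C taken by itself, but that once the cause B is given, A makes no further difference to the probability of C.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical chain A -> B -> C; all numbers are illustrative assumptions.
P_A = {True: F(3, 10), False: F(7, 10)}
P_B_GIVEN_A = {True: F(9, 10), False: F(1, 5)}   # P(B=True | A)
P_C_GIVEN_B = {True: F(4, 5), False: F(1, 10)}   # P(C=True | B)


def joint(a, b, c):
    """Joint probability of (A=a, B=b, C=c) under the chain factorisation."""
    pb = P_B_GIVEN_A[a] if b else 1 - P_B_GIVEN_A[a]
    pc = P_C_GIVEN_B[b] if c else 1 - P_C_GIVEN_B[b]
    return P_A[a] * pb * pc


def cond_prob(event, given):
    """P(event | given), where both arguments are predicates on (a, b, c)."""
    worlds = list(product([True, False], repeat=3))
    num = sum(joint(*w) for w in worlds if event(*w) and given(*w))
    den = sum(joint(*w) for w in worlds if given(*w))
    return num / den


# Unconditionally, A raises the probability of C ...
p_c = cond_prob(lambda a, b, c: c, lambda a, b, c: True)
p_c_given_a = cond_prob(lambda a, b, c: c, lambda a, b, c: a)

# ... but conditional on the cause B, A is screened off from C.
p_c_given_b = cond_prob(lambda a, b, c: c, lambda a, b, c: b)
p_c_given_b_and_a = cond_prob(lambda a, b, c: c, lambda a, b, c: b and a)
```

Exact fractions are used so that the equality of the two conditional probabilities is not blurred by floating-point rounding.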


1.5 The Ontology of Causation 


In order to make sense of causal relations, we must be able to apply the part- 
of relation (and the associated machinery of mereology) to the causal relata. 
This means that we must acknowledge the reality of concrete existences, to- 
kens, that can play the role of concrete events and states (or “situations”). In 
addition to these situation-tokens, we will need abstract, repeatable situation- 
types. The situation-types represent intrinsic qualities or characters of situation- 
tokens. This choice of primitives is drawn from the work of Barwise, Perry, and 
Etchemendy (Stanford situation theory). 

The situation-tokens can serve as the truth-makers for propositions, playing 
the role that “facts” play in the philosophies of Austin, Bergmann, and 
Hochberg. When it is true that the cat is on the mat, there is a concrete 
cat-on-the-mat situation-token s that makes it true. This token s is of the 
cat-on-the-mat type. 

Complex situation-types can be constructed from simpler ones by means 
of logical operators, such as negation and disjunction. These operators should 
be interpreted by means of the strong Kleene three-valued truth tables or the 
four-valued Dunn tables (as explained in appendix A). 
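
For readers unfamiliar with the strong Kleene scheme, here is a minimal sketch of its three connectives. The encoding is an assumption made for illustration (the names `TV`, `k_not`, `k_and`, and `k_or` are hypothetical, and the four-valued Dunn tables are not covered): the values are ordered FALSE < UNDEFINED < TRUE, conjunction takes the minimum, disjunction the maximum, and negation swaps the two classical values.

```python
from enum import Enum


class TV(Enum):
    """Truth values of the strong Kleene scheme (numeric codes are an assumption)."""
    TRUE = 1
    UNDEFINED = 0
    FALSE = -1


def k_not(a):
    # Negation swaps TRUE and FALSE and leaves UNDEFINED fixed.
    return TV(-a.value)


def k_and(a, b):
    # Conjunction is the minimum in the order FALSE < UNDEFINED < TRUE.
    return TV(min(a.value, b.value))


def k_or(a, b):
    # Disjunction is the maximum in the same order.
    return TV(max(a.value, b.value))
```

On this scheme a disjunction with one true disjunct is true even if the other disjunct is undefined, while an instance of excluded middle with an undefined component is itself undefined.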

In addition to tokens and types, there is a causal priority relation <, a strict 
partial ordering (transitive, irreflexive, and asymmetric) of situation-tokens. If 
s < s’, then s is qualified to act as part of a cause of s’. Intuitively, we can 
think of s < s’ as meaning that s is wholly in the backward time cone of s’. 

In chapters 5 and 8, I advocate the thesis that all of the causal antecedents 
of a token are essential to its identity: if any of them had failed to exist, the 
token itself could not have existed. If we accept this thesis, then we can define 
the causal priority relation in this way: s < s’ if and only if s and s’ do not 
overlap mereologically (that is, they have no part in common), and no part of 
s’ could exist unless s existed. 

There are two notions of causation that I define: (1) total causation (s is a to- 
tal cause of s’) and (2) INUS causation. INUS causation refers to J. L. Mackie’s 
account of a cause as an insufficient but necessary part of an unnecessary but 
sufficient condition for the effect (Mackie (1965)). Both total causation and 
INUS causation introduce a modal or statistical element: a total cause must 
make its effect conditionally necessary, or at least, conditionally much more 
probable than it would otherwise be. An INUS cause is an indispensable part 
of some total cause: s is an INUS cause of s’ just in case there is a total cause 
s” of s’, s is a part of s”, and any part of s” that does not contain s as a part 
is no longer a total cause of s’. 
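
The definition of INUS causation can be made concrete with a toy model. This is only a sketch under simplifying assumptions that are not the book's formalism: tokens are modelled as finite sets of atomic parts, the part-of relation as set inclusion, and the total causes of one fixed effect are simply stipulated. The function names and the fire example are hypothetical.

```python
from itertools import combinations


def parts(token):
    """All non-empty parts (sub-collections of atoms) of a token."""
    atoms = sorted(token)
    return [frozenset(c) for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]


def is_inus_cause(s, total_causes):
    """True if s is an indispensable part of some total cause of the effect.

    total_causes: the tokens stipulated to be total causes of one fixed
    effect (an assumption of this toy model).
    """
    for whole in total_causes:
        if not s <= whole:
            continue
        # Every part of the total cause that omits s must fail to be a total cause.
        if all(t not in total_causes for t in parts(whole) if not s <= t):
            return True
    return False


# Hypothetical example: a short circuit with fuel and oxygen is a total
# cause of a fire, and so is arson with oxygen.
short, fuel, oxygen, arson = "short", "fuel", "oxygen", "arson"
total_causes = {
    frozenset({short, fuel, oxygen}),
    frozenset({arson, oxygen}),
}
# Adding an idle extra part yields a larger total cause in which the extra
# part is dispensable.
padded = total_causes | {frozenset({short, fuel, oxygen, "paint"})}
```

In this model the short circuit counts as an INUS cause of the fire, while the idle extra part does not, because the total cause containing it remains a total cause when that part is removed.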


1.6 The Need for an Indeterministic Model 


In chapter 4, I develop a deterministic model of causation, one in which a 
total cause necessitates its effect. However, I discover a number of independent 
reasons for being dissatisfied with such a model: 


1. We have clear intuitions that causation should be possible in an indeter- 
ministic world. 


2. If causes necessitate their effects, and effects necessitate their causes (since 
the identities of their causes are essential to their own identities), then 
causes and effects would be modally inseparable. 


3. When applied to specific examples, the necessitation model over-generates 
causal connections and inflates the minimal content of causal explanations. 

There are several difficulties that pose serious problems for building an 
indeterministic model of causation, however. First of all, verifying the transitivity of 
causation is no longer trivial, once we abandon strict necessitation as the stan- 
dard. Verifying the veridicality of causation is also non-trivial. In addition, mere 
probabilistic relevance is neither necessary nor sufficient, as is demonstrated by 
two kinds of cases: (1) causes with no or even with negative statistical rele- 
vance to their effects, and (2) pre-empted causes, preconditions with positive 
statistical relevance that are nonetheless not causes, because some independent 
factor preempts their operation. Finally, there is the Markovian independence 
principle that I mentioned above, which is critical to explaining the modularity 
of causal reasoning, but which is also difficult to secure in an indeterministic 
setting. In chapter 6, I use Lewis/Stalnaker conditionals in a novel way to 
overcome these difficulties. 


1.7 A Causal-Probabilistic Theory 
of Information 


My teleological account of mental representation depends crucially on being 
able to define information without reference to mentality or teleofunctionality. 
In order to do this, I borrow heavily from the work of Fred Dretske (1981), in 
which information is defined by means of objective probabilities. According to 
Dretske, a fact p carries the information q just in case the conditional probability 
of q on p is equal to 1, which Dretske interprets as meaning that p necessitates 
q. 

The principal difficulty with such an account is that of accounting for the 
possibility of error or misinformation. If p carries the information that q, then 
it is impossible for p to be true and q false. There are two popular solutions to 
this difficulty, neither of which is really satisfactory. We could require only that 
the conditional probability of q on p be within some small, finite interval of 1, 
or we could require only that the conditional probability of q on p be higher 
than that of q on ~p. However, if we do either of these, we lose the validity 
of the Xerox principle, the principle that information is transitive: if p carries 
the information that q, and q carries the information that r, then p carries the 
information that r. 
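
A small calculation shows how a threshold weaker than 1 breaks the Xerox principle. The numbers below are illustrative assumptions, not taken from the text: a cut-off of 0.95, a chain in which r depends on p only through q, and r never occurring without q.

```python
from fractions import Fraction as F

THRESHOLD = F(19, 20)  # hypothetical cut-off of 0.95 for "carries the information"

# Assumed chain p -> q -> r, with r depending on p only through q.
p_q_given_p = F(19, 20)      # P(q | p)
p_r_given_q = F(19, 20)      # P(r | q)
p_r_given_not_q = F(0)       # P(r | not-q)

# Law of total probability, using the assumed screening-off of p by q.
p_r_given_p = p_q_given_p * p_r_given_q + (1 - p_q_given_p) * p_r_given_not_q


def carries(prob, threshold=THRESHOLD):
    """Threshold reading of 'carries the information' (an assumption of this sketch)."""
    return prob >= threshold
```

Here p carries the information that q and q carries the information that r, but the probability of r on p is 361/400, which falls below the cut-off, so transitivity fails. With the threshold set to 1, the product of the two conditional probabilities is again 1 and the principle is preserved.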

A second popular strategy (adopted by Dretske himself) is to add some 
condition N, representing normal or canonical training conditions, and require 
that the conditional probability of q on the conjunction p & N be equal to 1. 
These normal conditions are usually specified retrospectively, by reference to 
some salient, historical facts. In chapter 9, I argue that these retrospective 
strategies are inadequate, and I propose two alternative solutions, one using 
infinitesimal probabilities and the other conditional functions. 

A token s carries the information that p robustly in world w just in case 
every part s’ of w that contains s as a part carries the information that p. This 
means that s carries the information that p, and every extension of s in w also 
carries this information. Robust information is the pre-cognitive analogue of 
knowledge. When one knows something on the basis of robust information, one 
is immune to Gettier-like counterexamples. 
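
The definition of robustness can be sketched in a toy model. Everything here is an illustrative assumption rather than the book's formalism: tokens and worlds are finite sets of atomic parts, the relation of carrying information is supplied as a stand-in predicate (`toy_carries`), and the stopped-clock scenario is a hypothetical example of a Gettier-like case.

```python
from itertools import combinations


def extensions(s, world):
    """All parts of `world` that contain the token `s` as a part."""
    rest = sorted(world - s)
    return [s | frozenset(c) for r in range(len(rest) + 1)
            for c in combinations(rest, r)]


def carries_robustly(s, prop, world, carries):
    """True if s and every extension of s within world carry the information prop."""
    return all(carries(t, prop) for t in extensions(s, world))


def toy_carries(token, prop):
    # Hypothetical stand-in for the probabilistic definition: the clock reading
    # carries the information unless the token also includes the stopped mechanism.
    return (prop == "it is 3:00"
            and "reads 3:00" in token
            and "mechanism stopped" not in token)


reading = frozenset({"reads 3:00"})
good_world = frozenset({"reads 3:00", "mechanism running"})
bad_world = frozenset({"reads 3:00", "mechanism stopped"})
```

In the world where the mechanism is running, the reading carries its information robustly. In the world where the clock has stopped, the reading taken alone still carries the information in this toy model, but one of its extensions does not, so the information is not robust.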


1.8 Why an Exact Theory? 


A formal or exact theory is an attempt to use logic and mathematics to represent 
a conception (or family of conceptions) of a particular subject matter. For 
example, Newtonian mechanics involved the use of the calculus to represent 
a conception of the physics of motion. 

Formal or exact metaphysics should not be thought of as the analysis of 
concepts, or as a branch of pure logic. Nor should it be identified with the artic- 
ulation of our commonsense worldview (the conception of the world ensconced 
in ordinary language and everyday practice), although metaphysics typically 
begins with this task. 

An exact theory of a metaphysical subject, such as causation, is an attempt 
to express our best, most-educated guesses about the truth of the matter in a 
form that is as falsifiable and corrigible as possible. The alternative to develop- 
ing an exact theory is operating with an undisciplined miscellany of hunches and 
intuitions, poorly defined and changing unsystematically as one moves from one 
sphere of application to another. Without an exact theory, inconsistency is very 
difficult to detect. Unanticipated consequences are rarely discovered, and one’s 
reasoning is often afflicted with non sequiturs and unintended equivocation. 

The task of defining and investigating an adequate formal language for rep- 
resenting causal reasoning remains unfinished. Recent work by Pearl (1988), 
Pearl and Verma (1991), and Spirtes et al. (1993) is suggestive but limited, in 
that all this work takes the relation of causation to hold among a fixed enu- 
meration of dynamic variables. However, in ordinary causal reasoning, we often 
take complex facts and events to be causal factors. In part I, I define a formal 
language for causal reasoning that is capable of treating facts of arbitrary com- 
plexity as causes and effects, and of resolving many of the outstanding logical 
puzzles. 

I am confident that the theory of causation that I develop in part I is clear 
and precise enough to be falsifiable. Where it goes wrong (as I’m sure in many 
places it does), it should be possible to construct clear counterexamples, either 
from real life or from imagination, accompanied by strong intuitions of real 
possibility. 

The subject of causation has experienced a renaissance in analytic philosophy 
over the last generation. Theories and arguments involving causation proliferate, 
in epistemology, philosophy of mind, philosophy of science, and philosophy of 
language. However, few working in these areas have attempted systematic and 
exact accounts of causation, and no such account, to my knowledge, is directly 
relevant to as broad a range of outstanding philosophical problems as is the 
account presented here. 


1.9 The Big Picture: Preview of Part II 


In this book, I develop a theory of causation, and I apply this theory to a large 
number of outstanding problems in philosophy, including such topics as: 


• The definition of proper function (teleology) 

• The semantics of mental representations 

• The mind/body problem (including free will) 

• The causal basis for logical and mathematical knowledge and cognition 

• The problem of induction (including Goodman’s puzzle) 

• Enduring substances and their identity-conditions 

• The construction of space and time 

• The objectivity of values and moral norms 


Obviously, I cannot do justice to the vast literature on any one of these topics. 
However, in each of these topics, the concept of causation plays a central role, 
and I cannot claim to have developed an adequate theory of causation without 
at least beginning the task of testing my theory against the data provided by 
each of these problem areas. For this reason, I have been forced to cast my net 
very broadly. 

I do not pretend to have said anything dispositive on any of these subjects 
in this book, but I do believe that the novel account of causation that I develop 
here enables me to make a genuinely original contribution in each case, one 
that I hope will stimulate further discussion. In each case, confusion about the 
nature and conditions of causation has produced an impasse. The introduction 
of an exact account of causation, together with the development of some novel 
proposals, may help to move the discussion to more fruitful ground. 

The overall structure of the project goes something like this. The theory of 
causation and information (developed in part I) is used to construct a theory of 
teleofunctionality as a form of higher-order causation (chapter 12), and an ac- 
count of the causal efficacy of logical and mathematical facts (chapter 15). After 
a survey of recent accounts of mental representation, I combine my theories of 
information and teleology, resulting in an account of the semantics of mental 
representations (chapter 14): a mental representation carries the content p just 
in case it has the teleofunction of carrying the information that p. The theory 
of mental representation is then used in developing theories of mind/body inter- 
action, qualia, and free will (chapter 16), and knowledge and induction (chapter 
17). I develop a causal/teleological theory of enduring substances and their 
identities through time in chapter 18. Both the theory of teleology and that of 
mental representations are used in the development of a eudaemonistic theory 
of ethics (chapter 19), which in turn is used in sketching an account of moral 
realism (chapter 20). 

Here are some of the more significant claims that I make in part II: 

Figure 1.1: Overview 


1. There is a tight connection between the semantics of belief and
epistemology: once we have the semantics right, the theory of knowledge is
merely a corollary (chapters 14 and 17). 

2. There are powerful reasons for rejecting materialism (which I take to
include, at a bare minimum, the limitation of causal relations to
spatiotemporal items), reasons that are independent of the well-known
problems in the philosophy of mind (see chapter 21 for a summary of these
reasons). 

3. A simple, causal theory of mathematical thought and knowledge is
possible, one that unifies the theory of mathematical knowledge with that of
empirical and scientific knowledge (chapter 15). 

4. Taking functions seriously leads to a very robust form of ethical realism,
one that does not identify objectivity with some sort of idealized
subjectivity but instead revives the eudaemonism of Plato and Aristotle
(chapters 19 and 20). 

5. The use of the mereology of events and of non-classical (three- and
four-valued) interpretations leads to more sophisticated conceptions of
supervenience, type identity, and token identity than were available
heretofore. These more sophisticated conceptions enable us to solve the
problem of mental causation (chapter 16). 


My aim in this book is to bring an end to the dualism that has dogged 
philosophy since the downfall of Aristotle’s metaphysics (including his “meta- 
physical biology”) at the beginning of the modern era. Commentators such as 
Leo Strauss, Alasdair MacIntyre, and John McDowell have all located the roots 
of the dualisms of mind and body, of fact and value, and of objectivity and 
subjectivity, in that early modern separation of scientific fact and normativity. 
In my view, the early modern turn away from Aristotle has been both unneces- 
sary and disastrous. Aristotle’s “metaphysical biology” is more viable in light 
of modern knowledge than it has ever been, and the recognition of this fact can 
bring about a great reunification of our view of the world. 

At the same time, I will argue staunchly against a false reunification built 
upon a narrow physicalism. Physicalists have been right to insist that our 
knowledge of the real cannot extend beyond the network of causation. They 
were right, therefore, to challenge the viability of positing a subjective and 
normative realm beyond the reach of science. However, they were wrong to 
think that science teaches us that only physical states, states located within 
the framework of space and time, can be causally efficacious. In fact, science 
provides abundant evidence, albeit implicitly, of the causal efficacy of physical, 
mathematical, and logical modality. 

There is no need to read the chapters of this book in strictly sequential 
order. In fact, I expect few readers to be interested in all of the topics covered. 
For example, if you have little interest in logic or in formal theories of causal 
reasoning, you can skip appendixes A and B altogether, without doing damage 
to your comprehension of the rest of the book. If you don’t care about learning 
the ins and outs of the metaphysics of causation, then I would recommend 
giving part I only a cursory reading and getting into the applications in part II 
as quickly as possible. You could go directly to part II, referring back to part I 
only as needed (I hope the cross-references, the index, and the table of contents 
will give you all the guidance you need). If you would like to read just enough of 
part I to grasp the outlines of my account of causation, I would suggest reading 
chapters 3, 4 (especially 4.1 through 4.8), and 9, while skipping the technical 
material, such as the proofs and detailed examples. 

Alternatively, if your interests lie exclusively in the field of philosophical 
logic or theories of causation, there is no reason for you to read part II at all. 
In addition, you should feel free to jump around within part II all you wish: 
the order of the chapters is not essential. My only recommendation would be 
for you to read chapter 12 before reading chapters 14, 16, 17, 19, or 20, and to 
read chapter 14 before 16 or 17. 


1.10 A Glossary of Symbols 


Although this book contains a considerable number of formulas of symbolic logic, 
the meanings of the formulas are nearly always spelled out in plain English. 
There are a few logical and mathematical symbols that the reader must be 
familiar with: 


Logical Symbols 


• ¬ represents negation, “it is not the case that...” 

• ∨ represents inclusive disjunction, “either... or... (or both).” 

• & represents conjunction, “both... and...” 

• → represents a conditional, “if..., then...” 

• ↔ represents the biconditional, “... if and only if...” 

• ∀x represents universal quantification, “every object x is of such a kind 
that...” 

• ∃x represents existential quantification, “there is at least one object x of 
such a kind that...” 

• □ represents necessity, and ◇ represents possibility. 

• □→ represents a non-truth-functional conditional: (φ □→ ψ) means that 
ψ is extremely probable (objectively speaking), conditional on φ. These 
conditionals warrant defeasible inferences. 

• φ[t/x] represents the substitution of x by t throughout formula φ. 

• Pr(A/B) represents the conditional probability of A on B. 


Metalinguistic Symbols 


• ⊨ represents the relation between a token (or a token in a model) and a 
type relative to a model, where M, s ⊨ φ is true just in case s supports 
type φ (according to model M). In accordance with standard mathemat- 
ical practice, I also sometimes use the ⊨ symbol to represent the relation 
of logical consequence or implication between formulas or propositions 
(especially in appendix A). 

• |* represents the relation of nonmonotonic or defeasible consequence, 
defined in appendix B. 

• ⊢ is used in representing the inference rules of a logical system. The 
symbol ⊣⊢ represents a two-way, or reversible, inference rule. 

• ||t||, ||φ|| represent the interpretations of symbols t and φ in the model 
under consideration. 


Set Theoretic Symbols 


• ∈, ⊆ represent membership and subset, respectively. 

• ∪, ∩ represent union and intersection. 

• ∅ is the empty set. 

• R[A] is the image of A under relation R, that is, the set of all of the 
objects that are related by R to something in A. 


In addition to these familiar symbols, I will make use of a significant number 
of special symbols. These are all introduced at appropriate places in the text, 
but I have assembled them all here as well, for the sake of later reference. 


Symbols of Mereology 

• ⊑ represents the non-strict part-to-whole relation (everything bears this 
relation to itself). 

• ⊏ is the symbol for proper parthood (asymmetric). 

• ⊔ and ⊓ represent mereological union and intersection, respectively. 

• ○ represents mereological overlap (having a part in common). 

• iφ represents the mereological sum of all the things that satisfy the open 
formula φ. 

Special Primitive Symbols 

• As represents the actuality of situation s (its being part of the actual 
world). 

• |= is used to form a higher-order type by conjoining a situation-token and 
a type, i.e., the expression (s |= φ) represents the type that is realized by 
any token s’ whenever s supports the type φ. |= is an object-language 
counterpart to the metalinguistic ⊨. 

• ≺ represents the relation of causal priority. (This is primitive in chapter 
5, but definable according to the model built in chapter 6.) 

• ≺₀ represents immediate causal priority: s ≺₀ s’ just in case s is prior to 
s’, and there is nothing intermediate between any part of s and any part 
of s’. 

• > represents the total cause relation: s > s’ if s is a total cause of s’. 

• ~ stands for causation in the sense of Mackie’s INUS condition: an insuf- 
ficient but necessary part of an unnecessary but sufficient condition. I also 
use this symbol to represent the closely related idea of causal relevance of 
one fact to another. 

• |~ represents the relation of causal constraint between types. 

• N stands for the immediate causal succession relation: sNs’ means that 
s’ is the mereological sum of all the situations immediately posterior to s. 

• => and + represent the simple and robust carrying of information. 

• The expression (s : φ) is used to represent an ordered pair consisting of 
a situation s and a type φ. These ordered pairs are typically used to 
represent actual or possible facts. 


Part I 


A Theory of Causation 
and Information 


2 


Toward a Unified Theory 
of Causation 


The literature in the twentieth century on causation is vast and complex. I 
will give here only a cursory survey of it, with the aim of locating the elements 
that I have appropriated into the formal theory developed in the rest of part I. 
My main objective has been to unify the theory of causation in such a way as 
to provide something useful to philosophers of science, researchers in artificial 
intelligence, and philosophers of mind and intentionality. 

The main division within recent work in causation comes between those 
who have focused on causal relations between event-types and those who fo- 
cus on relations between event-tokens. An integrated account of both of these 
sets of relations is much needed. The focus on event-types typifies the broadly 
Humean tradition, including the deductive-nomological model, statistical theo- 
ries of probabilistic causation, and Mackie’s INUS account. In contrast, David- 
son, counterfactual accounts like those of David Lewis, branching-time theorists 
like Kutschera, singularists like Nancy Cartwright, Michael Tooley, and David 
Armstrong, and ontological-linkage theorists like Wesley Salmon, James Fair, 
Phil Dowe, and Douglas Ehring all place primary emphasis instead on the oc- 
currence of concrete event-tokens. 

In my account, I try to give equal justice to both the token and the type 
levels. My account is essentially a modal account of causation, and modal rela- 
tions, like those of conditional necessity or objective chance, can hold as well 
between token events as between event-types. My framework enables me to be 
neutral on the question of the existence of singular causation: I can represent 
the possibility of a singular connection between token-events, but nothing in my 
theory commits me to treating this as a real possibility. 


2.1 The Nomological/Deductive Tradition 


Hume argued that the concept of causation cannot be a primitive, undefin- 
able concept, since we have no sensory acquaintance with the causal relation as 
such. He suggested that we can define causation (or, perhaps, replace it with one 
defined) in terms of regular associations of event-types. The Humean tradition 
takes the task of science to be the discovery of natural laws, certain kinds of reg- 
ularities in the occurrence of event-types. One event causes another if the type 
of the second can be deduced from the type of the first by means of true natural 
laws. Consequently, this model became known as the “nomological/deductive” 
model. 

The nomological/deductive model has run into a number of problems: 


• It has proved impossible to give a satisfactory account of the direction of 
causation, the asymmetry of the cause/effect relation. 

• The relation between causation and time remains an unilluminated mys- 
tery. Typically, there is the bare, unmotivated stipulation that causes 
must precede their effects. 

• There are some difficulties in extending the model to cover probabilistic 
causes and other kinds of indeterminism. 

• Humeans have not been able to produce a plausible account of the dis- 
tinction between natural laws and merely accidental generalizations. 

• There are a number of resistant counterexamples to the model, including 
preempted would-be causes, and the apparent possibility of worlds with 
correlations but no causation whatsoever. 


At the same time, it is vitally important to acknowledge the many virtues 
of the N/D approach. In replacing the model, we must find an alternative that 
subsumes its successes. 


• It provides an explanation for the connection between causation and cor- 
relation. 

• It deals explicitly with causal relations between event-types. 

• It provides a plausible model of causal explanation, drawing on the analogy 
between explanation and deduction. 


2.2 Theories of Probabilistic Causation 


Humean empiricists like Reichenbach, Suppes, Eells, Humphreys, and Skyrms 
have created a very impressive body of work extending the nomological/ 
deductive account to the domain of probabilistic causal theories and statisti- 
cal data. As the work has progressed, we can see a clear movement away from 
the strict reductionism of Hume and toward an account in which the relation of 
causal relevance or priority is taken as an unanalyzed primitive. The account 
of causation that we find in Skyrms and Eells falls roughly into this pattern: 


C is a positive causal factor for E iff P(E/C&H) > P(E/¬C&H), 
where H includes all of the causal factors relevant to E, except for 
C itself, and those factors causally influenced by C. 


Notice that this definition does not attempt to give a reduction of all causal 
concepts to merely statistical or probabilistic ones: the relation of being a rel- 
evant causal factor is left unanalyzed. A second feature of the standard prob- 
abilistic approach is its exclusive attention to causal relations at the level of 
types. Very little is said about what it takes for one token to be a cause of 
another. 
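
The probabilistic criterion can be illustrated with a short computation. The
Python sketch below is a minimal example, not a rendering of any particular
author's procedure: it assumes that the background factors H take finitely
many values and that the inequality is to hold in each of them (one common
reading, adopted here as an assumption). The counts are invented, and the
names cond_prob and positive_factor are hypothetical.

```python
from collections import Counter

def cond_prob(counts, c, h):
    """Estimate P(E | C=c, H=h) from joint counts keyed by (c, h, e)."""
    total = counts[(c, h, True)] + counts[(c, h, False)]
    return counts[(c, h, True)] / total

def positive_factor(counts, contexts):
    """True iff P(E/C&H) > P(E/not-C&H) in every background context H."""
    return all(cond_prob(counts, True, h) > cond_prob(counts, False, h)
               for h in contexts)

# Invented frequencies: keys are (C present?, background context, E present?).
counts = Counter({
    (True, "h1", True): 30, (True, "h1", False): 70,
    (False, "h1", True): 10, (False, "h1", False): 90,
    (True, "h2", True): 80, (True, "h2", False): 20,
    (False, "h2", True): 60, (False, "h2", False): 40,
})

print(positive_factor(counts, ["h1", "h2"]))  # True
```

Note that the choice of which factors belong in H is an input to the
computation. The statistics do not supply it, which is the sense in which
causal relevance is left unanalyzed.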


2.3 Davidson and Event-Tokens 


Donald Davidson’s work on causation, like earlier work by Anscombe and 
Ducasse, is concerned with causation as a relation between concrete event- 
tokens. Davidson’s approach is resolutely non-reductive, thereby avoiding the 
counterexamples to the deductive/nomological account. 

This attention to tokens and their relations was an important corrective 
to the Humean tradition, but Davidson’s original treatment of event-types was 
seriously defective. Davidson did not distinguish between the intrinsic character 
of an event and arbitrary true descriptions of the event. For instance, the 
intrinsic character of the murder of Caesar includes facts about the number, 
angle, and timing of the knife thrusts. It does not include features mentioned 
in such extrinsic descriptions as: foreseen by Caesar’s wife, the cause of a civil 
war, or the result of Caesar’s high-handedness. However, it is by virtue of their 
intrinsic types that tokens support causal relations. 

Davidson individuates events by including all of the causes and the effects 
in the essence of each individual event. This means that the occurrence of any 
particular event necessitates both its own past and the entire subsequent course 
of history. Davidson was only half right here: the actuality of a particular 
situation-token necessitates the actuality of all causally prior tokens, but not 
that of the causally posterior ones. It is this asymmetry that constitutes the 
fixity of the past and the openness of the future (see section 5.3). 


2.4 Lewis’s Counterfactual Account 


Like Davidson, David Lewis sees causation as primarily a relation between event- 
tokens. Lewis’s theory has the virtue of connecting causation with modal prop- 
erties (like necessity and sufficiency) via his work on the logic and semantics of 
counterfactual (subjunctive) conditionals. 

In brief, Lewis (Lewis, 1986b, pp. 164-167) defines causal dependence be- 
tween tokens x and y in this way: 


1. If x had not occurred, y would not have occurred. 
2. If x had occurred, y would have occurred. 
3. x and y both occurred. 


Condition (1) states that the occurrence of x is necessary, not absolutely 
but in the actual circumstances, for the occurrence of y. Condition (2) states 
that the occurrence of x is sufficient (again, in the actual circumstances) for the 
occurrence of y. Lewis defines causation as the transitive closure of causal 
dependence. 
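
The step from causal dependence to causation is a purely formal one, and it
can be sketched in a few lines. The Python example below is illustrative
only: it takes a hypothetical dependence relation as a set of ordered pairs
(the event names are invented) and computes its transitive closure, that is,
its ancestral.

```python
def transitive_closure(depends):
    """Return the ancestral of a relation given as a set of (x, y) pairs."""
    closure = set(depends)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical case: d depends on c, and e depends on d.
dependence = {("c", "d"), ("d", "e")}
causes = transitive_closure(dependence)

print(("c", "e") in causes)  # True, although e need not depend on c directly
```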

In my view, Lewis’s reliance on counterfactuals to define causation has the 
order of analysis backward. An adequate account of the semantics of counterfac- 
tuals must incorporate causal notions. The work of Stalnaker (1986) and Lewis 
(1973) on the logic of counterfactuals, and on the formal semantics of these con- 
ditionals, is quite impressive and entirely successful. However, more is required 
than a logic and a formal theory of semantics to qualify a concept for founda- 
tional use in metaphysics. A foundational concept must have a unity and fixity 
of reference that I believe counterfactual conditionals, with their sensitivity to 
context and practical interest, lack. 

In addition, Lewis gives no reason for the transitivity of causation but in- 
stead builds this condition into his definition by fiat (by taking the ancestral 
of the causal dependence relation). Furthermore, Lewis cannot guarantee that 
causation is asymmetric, and his account of the directionality of causation seems 
circular. 

Counterfactual accounts rule out a priori the possibility of necessary facts 
acting as causes. It is unclear how to evaluate counterfactuals with impossible 
antecedents, other than treating them all as vacuously true. Hence, any nec- 
essary fact would be, vacuously, a cause of everything, including itself. It is 
an essential feature of my account that necessary facts are well qualified to act 
as causes. This plays a crucial role, for example, in my account of the causal 
connections underlying logical and mathematical knowledge. 

Finally, there are a number of examples of preemption and overdetermination 
that Lewis’s account gets wrong, unless weakly motivated epicycles are added 
(Ramachandran (1997), Noordhof (1997)). For example, if e is caused by c, and 
c itself preempts d, and d would have caused e had it not been preempted by 
c, then e does not depend counterfactually on c, and so Lewis’s account does 
not treat c as a cause of e. In addition, it is difficult to see how Lewis’s account 
can be extended to probabilistic causation, or, in general, to causation in an 
indeterministic world (Menzies (1996)). 

Ramachandran has recently proposed a counterfactual analysis that avoids 
these counterexamples and that resembles the account I give in part I 
(Ramachandran (1997)). Ramachandran first defines an M-set of a: 


S is an M-set for a iff S is a minimal set such that if none of the 
members of S had occurred, a would not have occurred. 


Ramachandran then defines cause in terms of M-sets: 


c is a cause of e iff c belongs to an M-set for e, and there are no 
M-sets R and S for e such that R contains c and S differs from R 
only in containing one or more non-actual events in place of c. 


Ramachandran’s definition (like Mackie’s INUS condition, to be discussed in 
the next section) is an attempt to formalize the fact that a cause is necessary in 
the actual circumstances for its effect. A fatal flaw in Ramachandran’s definition 
lies in the definition of M-sets. The minimality of an M-set is defined solely in 
terms of set membership: there is no proper subset meeting the counterfactual 
condition. Nothing prevents c from belonging to an M-set even though c itself 
contains (as parts) totally irrelevant, and even causally posterior, sub-events. 
The mereological theory of event-tokens developed in chapter 3 is needed in 
order to define the appropriate form of minimality. 
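
The point about minimality can be seen in a toy model. The Python sketch
below is an illustration under stated assumptions, not a reconstruction of
Ramachandran's own machinery: events are modelled as sets of atomic parts,
the counterfactual "had none of the members of S occurred" is crudely
evaluated by deleting S from the actual events, and only the definition of
an M-set is modelled (not the further clause about non-actual events). The
event names and the function names are invented.

```python
from itertools import combinations

def m_sets(events, effect_occurs):
    """Minimal sets S of events such that the effect fails once S is deleted."""
    events = list(events)
    found = []
    for k in range(1, len(events) + 1):        # smallest candidates first
        for combo in combinations(events, k):
            s = frozenset(combo)
            if any(m < s for m in found):      # a proper subset already works
                continue
            if not effect_occurs(set(events) - s):
                found.append(s)
    return found

# A coarse event that bundles the spark with an irrelevant part.
spark_and_radio = frozenset({"spark", "radio_playing"})
oxygen = frozenset({"oxygen"})

def fire_occurs(occurring):
    atoms = set().union(*occurring)
    return {"spark", "oxygen"} <= atoms

result = m_sets([spark_and_radio, oxygen], fire_occurs)
print(frozenset({spark_and_radio}) in result)  # True
```

The set containing only the coarse event counts as minimal, because
minimality is checked over sets of events and not over the parts of those
events. The irrelevant part (the radio) rides along unnoticed.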

There are two further shortcomings to the counterfactual account of causa- 
tion. First, the account does not provide any guidance to the use of causal facts 
in prediction and explanation. We must already know what would and would 
not happen under various hypothetical situations before we can apply causal 
descriptions to the situation. Causal concepts are of no use in deriving these 
counterfactual relations. On my account, as delineated in appendix B, in con- 
trast, causal information is critical to the task of prediction and counterfactual 
projection. Hence, it is valuable to have a characterization of the causal relation 
that does not presuppose complete knowledge of counterfactual connections. 

Second, neither Lewis nor Ramachandran offers anything like a complete 
account of the principles of event identity. They rely on our somewhat woolly 
intuitions on a case-by-case basis. My account of token causation includes an 
explicit and precise account of the identity conditions of event-tokens (section 
4.3.5). 


2.5 Mackie’s INUS Conditions 


In his essay “Causes and Conditions” (Mackie (1965)), J. L. Mackie introduced 
the idea of an INUS condition. An INUS condition is a condition that is an in- 
sufficient but necessary part of an unnecessary but sufficient condition for some 
event-type. Mackie was working in the broadly Humean, empiricist tradition, 
and, consequently, his primary concern was with relations between situation- 
types. However, he was beginning to see the importance of relations between 
tokens, and of distinguishing clearly between event-types and event-tokens. In 
fact, the INUS idea works better than Mackie himself realized when it is trans- 
ferred to the setting of tokens. We can say that one token a is an INUS condition 
for another token b when a is an indispensable part of a token c whose occur- 
rence is sufficient for the occurrence of b. By “indispensable part of c,” I mean 
that no part of c that does not contain a is sufficient for b. This notion of INUS 
condition illuminates much of our natural-language discourse about causation 
(as I argue in chapter 3). 
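
The token-level INUS test just described can be written out directly. The
Python sketch below is a minimal illustration under simplifying assumptions:
tokens are modelled as sets of atomic parts, parthood as set inclusion, and
sufficiency for the effect as a hypothetical predicate supplied by hand. The
names parts, is_inus, and sufficient_for_fire are invented for the example.

```python
from itertools import combinations

def parts(token):
    """Yield all nonempty parts of a token, modelled as subsets of its atoms."""
    atoms = sorted(token)
    for k in range(1, len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def is_inus(a, c, sufficient):
    """True iff a is an indispensable part of the sufficient token c."""
    if not (a <= c and sufficient(c)):
        return False
    # Indispensable: no part of c that fails to contain a is itself sufficient.
    return not any(sufficient(p) for p in parts(c) if not a <= p)

def sufficient_for_fire(token):
    # Hypothetical sufficiency condition for the effect (a house fire).
    return {"short_circuit", "flammable_material", "oxygen"} <= token

c = frozenset({"short_circuit", "flammable_material", "oxygen"})
c_plus = c | {"radio_playing"}

print(is_inus(frozenset({"short_circuit"}), c, sufficient_for_fire))       # True
print(is_inus(frozenset({"radio_playing"}), c_plus, sufficient_for_fire))  # False
```

The second call fails because a part of the larger token that omits the
radio is already sufficient, so the radio is dispensable.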


2.6 Yablo’s Theory 


Stephen Yablo (1992), while working on a theory of mental causation, develops 
a theory of causation that bears some resemblance to my own. He also works 
with an ontology of event and state tokens, with a relation, subsumption, that 
is related to the part-of relation that I employ. Essentially, a state s subsumes 
s’, s > s’, just in case s’ is a part of s (s’ ⊑ s) and s and s’ are coincident. 
Two tokens are coincident when they occupy the same spatiotemporal location. 
Thus, Yablo’s theory only applies to tokens with spatiotemporal location. In 
addition, Yablo’s theory cannot be used to give a causal definition of spacetime, 
since it presupposes spatiotemporal relations in its formulation. 

The essence of a token corresponds to the set of intrinsic types supported by 
the token. Yablo assumes that all such types are persistent, in the sense that if 
s ⊑ s’ and s is of type φ, then s’ is also of type φ. This means that if s ⊑ s’, 
then the essence of s is a subset of the essence of s’. In Yablo’s terminology, if 
s subsumes s’, then the essence of s’ is a subset of the essence of s. 

Yablo uses counterfactuals to define two preconditions of causation: contin- 
gency and adequacy. 


Contingency (¬Oc □→ ¬Oe) 
Adequacy (¬Oc □→ (Oc □→ Oe)) 


These conditions are very similar to those used by Lewis in his counterfac- 
tual definition of causation. Where Yablo differs from Lewis is his use of the 
subsumption relation to capture a version of Mackie’s INUS condition. A token 
c is required for e just in case for every proper part c’ of c, if c’ had occurred 
without c, then e would not have occurred. Yablo’s condition of requirement 
can be thought of as a refinement or clarification of Contingency, since it tells 
us that in testing whether Oe would occur on the assumption of ¬Oc, we must 
consider every possibility in which some proper part of c occurs but c itself does 
not. Requirement guarantees that every part of c is necessary for the occurrence 
of e, under the circumstances. Any state token that is a part of such a required 
token will be an INUS condition of the effect, since it will be an indispensable 
part of a mereologically minimal adequate (quasi-sufficient) condition. 

Yablo’s use of unanalyzed counterfactuals burdens his account with the same 
deficiencies that characterized Lewis’s and Ramachandran’s accounts. 

In applying his theory to mental causation, Yablo reveals that his conception 
of event-tokens is more abstract than that of Davidson or Lewis. Apparently, 
there are logically “impoverished” tokens corresponding to each concrete occur- 
rence. For example, if there is a token of John walking, there is also a distinct 
token of John walking or Jane whistling, with the first token subsuming the sec- 
ond. In addition, there are also distinct tokens of someone walking and of John 
doing something. This leads to a very extreme multiplication of entities. On 
my alternative model, any token that realizes some genuinely disjunctive type 
must realize one or the other of the disjuncts. This corresponds to thinking of 
each token as a concrete part of the world. 


2.7 Branching-Time Models 


Work by Belnap (1987), McCall (1976), and von Kutschera (1993) builds on 
the branching-time models of temporal logic. For example, in von Kutschera’s 
theory, an event a is a cause of b just in case it is the first event whose occurrence 
guarantees the occurrence of b. In all these models, of course, temporal relations 
are taken as primitive, and causal order is parasitic upon temporal order. This 
makes time travel and backward causation absolutely impossible. It also blocks 
the option of giving a causal theory of spacetime. 


2.8 Artificial Intelligence and Models 
of Causal Inference 


Judea Pearl (1988) and his colleagues at UCLA have made considerable progress 
in recent years in two areas: the theory of causal inference (inferring causal 
structures from statistics, without prior information about causal or temporal 
priority), and the role of causal notions in defeasible, commonsense reasoning. 
Even more recently, Spirtes et al. (1993), building on Pearl’s work (as well as 
the Reichenbachian tradition), have developed workable algorithms for a well- 
defined program of causal inference. 

Throughout this work, Reichenbach’s notion of ‘screening off’ plays a central 
role. Roughly, if one factor screens off a second from one of its effects, then the 
conditional probability of the effect on the cause is independent of the screened- 
off factor. This principle is also known as Markov’s rule, after the famous 
Russian mathematician. 
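
Screening off is easy to check numerically on a small example. The Python
sketch below assumes a toy causal chain S -> C -> E over three binary
factors, with invented probabilities, and verifies that conditioning on C
makes S irrelevant to E even though S and E are correlated. The function
names joint, prob, and cond are hypothetical and are not drawn from any of
the works cited.

```python
from fractions import Fraction
from itertools import product

def joint(s, c, e):
    """P(S=s, C=c, E=e) for a toy chain S -> C -> E (all numbers invented)."""
    p_s = Fraction(1, 2)
    p_c_given_s = Fraction(9, 10) if c == s else Fraction(1, 10)
    p_e_given_c = Fraction(4, 5) if e == c else Fraction(1, 5)
    return p_s * p_c_given_s * p_e_given_c

def prob(event):
    """Probability of an event, given as a predicate on (s, c, e)."""
    return sum(joint(s, c, e)
               for s, c, e in product([True, False], repeat=3)
               if event(s, c, e))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda s, c, e: event(s, c, e) and given(s, c, e)) / prob(given)

effect = lambda s, c, e: e
pe_c = cond(effect, lambda s, c, e: c)              # P(E | C)
pe_c_and_s = cond(effect, lambda s, c, e: c and s)  # P(E | C & S)
pe_s = cond(effect, lambda s, c, e: s)              # P(E | S)
pe_not_s = cond(effect, lambda s, c, e: not s)      # P(E | not-S)

print(pe_c == pe_c_and_s)  # True: C screens off S from E
print(pe_s, pe_not_s)      # 37/50 13/50: S and E are nonetheless correlated
```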

Another principle that plays a crucial role in the theory of causal inference 
as developed by Pearl and by Spirtes, Glymour, and Scheines is Occam’s razor. 
This work includes a rigorous definition of the relative simplicity of a causal 
hypothesis. 

There are two main deficiencies in this body of work: first, it does not make 
a clear distinction between tokens and types, and, second, it deals only with 
logically simple factors. No work has been done to date on extending these 
ideas to types of arbitrary logical complexity. In appendix B, I develop a causal 
calculus that builds on this tradition and rectifies these two shortcomings. 


2.9 Tooley and Cartwright 


In Causation: A Realist Approach, Michael Tooley (1987) subjects the Hu- 
mean tradition to a barrage of cogent objections. Tooley demonstrates that 
singular causation (causation between tokens) does not supervene upon the 
causal laws and non-causal facts of the world, an insight that I incorporate 
into my own account. Tooley’s positive account treats causation as a relation 
between properties (universals). I agree with Tooley on the need for treating 
universals or types as first-class members of our ontology. 

In Nature’s Capacities and Their Measurement, Nancy Cartwright (1989) endorses 
and defends two theses that I have incorporated into my account. First, 
she argues, with Tooley, that singular causation is irreducible to type-level rela- 
tions. Cartwright uses the example of the complexity of the causal relationship 
between the use of the birth-control pill and thrombosis to demonstrate the 
priority of token-level causation (Cartwright, 1989, p. 99). In general, the use 
of the pill lowers the probability of thrombosis, by lowering the probability of 
pregnancy, which is a positive causal factor for thrombosis. However, in many 
cases, the use of the pill causes thrombosis directly. The truth of causal gener- 
alizations must be sensitive not only to statistical relationships among classes 
of events, but also to the presence or absence of token-level causal relations. 
Second, Cartwright insists that all causal generalizations are defeasible or ex- 
ception permitting. This latter insight plays a crucial role in my indeterministic 
model of causation in chapter 5. 
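
The statistical side of the pill example can be sketched with a small calculation. All of the numbers below are hypothetical and serve only to reproduce the qualitative pattern described above: the pill lowers the overall probability of thrombosis by making pregnancy rarer, while raising that probability within each pregnancy stratum. The sketch does not settle the token-level question; it only shows how an aggregate statistic can mask an opposite direct influence.

```python
from fractions import Fraction

# Hypothetical probabilities, invented for illustration only.
P_PREGNANT = {"pill": Fraction(1, 100), "no_pill": Fraction(20, 100)}
P_THROMBOSIS = {
    # (group, pregnant?) -> probability of thrombosis
    ("pill", True): Fraction(11, 100),
    ("pill", False): Fraction(2, 100),
    ("no_pill", True): Fraction(10, 100),
    ("no_pill", False): Fraction(1, 100),
}


def overall_risk(group):
    """Probability of thrombosis in a group, averaging over pregnancy."""
    p = P_PREGNANT[group]
    return p * P_THROMBOSIS[(group, True)] + (1 - p) * P_THROMBOSIS[(group, False)]


risk_pill = overall_risk("pill")
risk_no_pill = overall_risk("no_pill")

# Overall, pill users have the lower risk ...
pill_lowers_overall_risk = risk_pill < risk_no_pill

# ... although within each pregnancy stratum the pill raises the risk.
pill_raises_risk_in_each_stratum = all(
    P_THROMBOSIS[("pill", pregnant)] > P_THROMBOSIS[("no_pill", pregnant)]
    for pregnant in (True, False)
)
```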


2.10 Process and Linkage Theories 


In recent years, a number of theories of causation have been proposed that 
forthrightly insist that there is a real connection between causes and effects 
at the token level. This linkage is to be understood as an irreducible element 
of reality. On Wesley Salmon’s account (Salmon, 1998), causes and effects are 
connected by something called a process. David Fair (1979) proposes that the 
linkage consists in the transfer of energy; for Phil Dowe (1992, 1995), it 
consists in the transfer of some conserved quantity; and for Douglas Ehring 
(1997), in the transference of a property trope. 

I agree with all these accounts in thinking that a real, non-Humean linkage 
between token cause and token effect is needed. However, I locate this linkage 
in a modal connection: the asymmetric necessitation of the token-cause by 
the token-effect (see section 5.3). The main difficulty with all of the other 
ontological-linkage theories is that they are too narrow. They each cover some 
but not all cases of genuine causation. Salmon’s process account, for example, 
cannot handle cases of causation by absences, and it dogmatically rules out 
the possibility of action at a distance, despite the fact that quantum mechanics 
seems to require it (as Salmon himself concedes (Salmon, 1998, p. 231 n. 19)). 

A second drawback to the ontological-linkage theories is that they tend to 
be unilluminating. It would seem that to be able to distinguish between genuine 
processes and pseudo-processes, we must make use of an unilluminated concept 
of causation. The same is true for distinguishing cases of the genuine transfer 
of energy or charge or some trope from cases of mere coincidence of identical 
quantities or tropes. 

Thirdly, ontological linkage theorists believe that causation is always a purely 
local, intrinsic fact involving only the causally connected particulars. However, 
there are a number of clear counterexamples to this claim of intrinsicality. For 
example, there are cases of double prevention: cases in which A causes B by 
preventing the occurrence of a potential preventer of B. An escort fighter could 
participate in causing a successful bombing raid by shooting down interceptors 
that otherwise would have shot down the bombers. In such cases, there is no 
single, compact process to which the causal connection is an intrinsic feature. 
That the sending of the bomber was a cause of the ultimate damage depended 
on the extrinsic presence of the fighters, and, similarly, that the action of the 
fighters was a cause of the bomb damage depended on the extrinsic presence of 
the bombers. 

Finally, ontological linkage theories require that we make use of a primitive 
relation of identity over time for the conserved quantities or tropes. In contrast, 
I want to give a causal account of all identities through time, and so these 
conserved-quantity and conserved-trope theories of causation are of no use to 
me. 


2.11 Mellor’s Theory 


In a recent book, D. H. Mellor analyzes causation as a relation between facts 
that involves the modal relation of objective chance (Mellor, 1995). One fact 
causes another just in case it increases the objective chance of the second. My 
own account of causal explanation or fact/fact causation in sections 4.5 and 
5.4.2 is quite close to Mellor’s. The main differences are these: 


• Mellor takes fact/fact causation to be more fundamental than token/token 
causation, while I take the two to be equally fundamental. Mellor bases 
his position on the fact that absences or negative facts can act as causes 
and as effects. I concur with this assumption, but I would insist that these 
negative causes and effects are never pure absences: they always involve 
the supporting of some negative property by some situation-token at some 
determinate position in the causal network of the world. Consequently, 
wherever we have causation by or of absences, we also have instances of 
token-level causation. 


• Although Mellor insists on the possibility of higher-order or iterated causation 
(Mellor, 1995, p. 108), he never explains how this causation is possible 
on his account, since it would involve higher-order objective chance, 
a problematic notion (as I demonstrate in section 7.1). 


• Mellor accepts the substitution of classically equivalent sentences within 
causal contexts, while I argue in chapter 3 that only strong-Kleene equiv- 
alents may be so substituted. 


• Mellor defines causation in terms of the raising of the objective chance of 
the effect (as compared with some background level), which faces a number 
of counterexamples, as I discuss in section 6.5. My own account makes use 
of Mackie’s INUS conditions and the mereological relations among tokens, 
avoiding these counterexamples. 


• Mellor denies the existence of negative or complex properties, while such 
properties play a crucial role in my account of teleology, our knowledge of 
logic and mathematics, and mental causation. 


• Causal laws play a central role in Mellor’s account of causation, but he 
never provides an account of the semantics or logical form of such laws. 
In contrast, I provide an explicit account of the logic and semantics of the 
modal constraints that I use in elucidating the nature of causation. 


2.12 Accounts of Causal Asymmetry 


A basic feature of causality is its directionality and asymmetry; the relation be- 
tween a cause and its effect is different from the relation between an effect and 
its cause. There have been four predominant accounts of this asymmetry: the 
appeal to fork asymmetry, the appeal to entropy, the appeal to human agency 
and manipulability, and the appeal to time. The first of these was pioneered by 
Hans Reichenbach (1956) and has been recently defended by David Papineau 
(1992). Fork asymmetry refers to a global feature of the network of probabilis- 
tic connections between the world’s events. In a detailed study of this account, 
Daniel Hausman recently concluded that the assumptions underlying this ac- 
count are a “useful approximation,” but that the presence of fork asymmetries 
is neither a necessary nor a sufficient condition for the existence of causal direc- 
tion (Hausman, 1998, pp. 239-242). Huw Price (1992) has argued that “fork 
asymmetry is not a sufficiently basic and widespread feature of the world to 
constitute the difference between cause and effect.” As I discuss in appendix B, 
fork asymmetries and the screening off of probabilistic dependencies by common 
causes play an important role in the epistemology of causation and in our use 
of causation in drawing inferences, but I agree with Hausman and Price that 
they seem misplaced when pressed into metaphysical service as an analysis of 
the essence of causal direction. 

The account of the direction of causation in terms of the increase in entropy 
faces similar difficulties. Although unlikely, a decrease in entropy from cause to 
effect does not seem to be essentially impossible. 

There are two problems with accounting for the asymmetry of cause and 
effect by reference to human agency, i.e., to the fact that we manipulate causes 
in order to bring about effects, and not vice versa. First, this account seems too 
narrow, since it excludes causal relations from things that, because they are too 
large or too small, too fast or too slow, cannot be controlled by human beings. 
Second, it denies the objectivity of causal asymmetry, reducing it to a merely 
anthropocentric phenomenon. 

Since Hume, it has been popular to explain the difference between cause 
and effect by reference to time: causes always precede their effects. This has 
two major drawbacks. First, it rules out backward causation without sufficient 
warrant. For example, some recent interpretations of quantum mechanics have 
taken the possibility of temporally reversed causation seriously, and discussions 
of tachyons and Feynman electrons also presuppose the possibility of backward 
causation (Price, 1996; Cramer, 1986). Second, it would rule out any causal 
theory of time, rendering such an account circular. Causal direction seems 
promising as an account of the direction of time (with time being the axis 
through local spacetime that agrees with the predominant direction of causation 
in that neighborhood), as I suggest in sections 4.10.2 and 4.10.3. 

I account for causal asymmetry in chapter 5 in terms of the asymmetric 
necessitation of token-causes by their token-effects. The tokens causally prior 
to a given token are essential to its identity: it wouldn’t be the very situation- 
token it is were those causal antecedents either added to or subtracted from. 
This asymmetry corresponds to the fixity of the past and the openness of the 
future. This fragility of the identity of events will prove quite useful in making 
sense of cases of preemption. In section 10.2, I summarize the advantages of 
this account of asymmetry. 


2.13 Distinctive Features of My Theory 


To summarize, I will mention five distinctive features of my theory of causation: 


1. Causal priority is treated as a modal relation among tokens, not superve- 
nient on general and non-causal facts (as per Davidson, Cartwright, and 
Tooley). 


2. The theory includes clear and precise conditions for event-token identity, 
namely, sameness of parts, intrinsic types, and causal antecedents. 


3. Modality (possibility and necessity) and objective probability play a cru- 
cial role in the definition of causal relations. There is a seamless general 
theory covering both deterministic and indeterministic causation, and a 
link is established to the theories of probabilistic causality and causal in- 
ference in the work of Skyrms, Eells, and Pearl, and in the joint work of 
Spirtes, Glymour, and Scheines. 


4. The theory of mereology (of parts and wholes) is used to construct an 
improved version of Mackie’s INUS conditions. 


5. The theory of causation is comprehensive, including both token-level and 
type-level relations, and including causal relations among events, states, 
dispositions, and modal and causal facts. 


Although I will make use of merely possible, and even of impossible, situations 
in my models of partial modal logic, these are to be thought of as mere 
artifacts of the models. The only situations that really exist are actual situations 
(this is the thesis of actualism).¹ Modality is a property not of non-actual tokens, 
but of types.² Certain situation types have the property of being possibly, 
but not actually, instantiated. It is convenient, as Leibniz, Kanger, and Kripke 
have discovered, to represent these possibilities in formal models by means of 
merely possible worlds or situations, but the use of such models should not be 
taken as committing the theorist to the real existence of such merely possible 
worlds. It is because I take such an instrumentalist view of merely possible 
situations that I am able to take on board impossible situations with equanimity, 
and with very useful results. 


¹ For defenses of actualism, see Adams (1981), Fitch (1996), and Menzel (1991). 

My account is, without apologies, anti-Humean and non-empiricist. I give 
no special priority to non-modal or occurrent properties, or to properties with 
which we are immediately acquainted. I do not take the structure of space 
and time as given prior to my account of causation: instead, I seek to lay 
the groundwork for a causal theory of spatiotemporality. Natural laws, in the 
sense of merely extensional generalizations that we happen to find attractive or 
economical, or that just happen to form a simple and powerful theory of the 
world, play no role in my account. Causation is not taken to be a projection 
of our minds, or a function of our practices or preferences. If causality is not 
taken realistically, nothing else can be. 


² Strictly speaking, I should say that modal types are properties of actual situations. For 
each situation-type $\phi$, there exists a corresponding modal type $\Diamond\phi$, which is true in an actual 
situation $s$ just in case $s$ supports the possible instantiation of $\phi$. 


3 Situation Theory and Causation 


3.1 The Need for Situation Theory 


3.1.1 Naked Infinitives and Situation Theory 


In their seminal book Situations and Attitudes (Barwise and Perry, 1983), Jon 
Barwise and John Perry introduced the basic elements of situation theory. These 
elements included the existence of concrete parts of the world, called situations 
or situation-tokens; a realistic attitude toward abstract situation-types; and the 
use of a partial, non-classical semantics for the relationship between situation-tokens 
and situation-types. Barwise and Perry used the strong Kleene 
tables of three-valued logic, with the three values true, false, and 
undefined. 

Here are the strong Kleene truth tables for negation, disjunction, and con- 
junction. 

(T = true, F = false, U = undefined.) 

Negation: 

    p      not-p 
    T      F 
    F      T 
    U      U 

Disjunction (p or q): 

             q = T   q = F   q = U 
    p = T      T       T       T 
    p = F      T       F       U 
    p = U      T       U       U 

Conjunction (p and q): 

             q = T   q = F   q = U 
    p = T      T       F       U 
    p = F      F       F       F 
    p = U      U       F       U 


32 Realism Regained 


The principal motivating data for early situation theory was work done by 
Barwise on the semantics of naked-infinitive clauses within perceptual contexts. 
For example, consider the contrast between these two sentences: 


Mary saw that John was smiling. 
Mary saw John smile. 


The first sentence, with a “that”-clause in the complement position, entails 
that Mary knew that she was looking at John and was aware that he was smiling. 
The second, with a naked-infinitive clause as complement, entails neither of 
these: it could be true even if Mary believed that she was seeing Paul wince. 

Barwise and Perry argued that the most natural way to understand naked- 
infinitive perceptual reports was to take the object of the perceiving to be a part 
of the world — in the case of seeing, a scene. A scene makes some sentences 
true and others false, and still others are made neither true nor false by the 
scene. The semantical relations between a scene and a sentence are embodied 
by the strong Kleene tables: if a scene $s$ makes a disjunctive sentence $(p \lor q)$ 
true, then it must make $p$ true or make $q$ true. If $s$ makes $(p \mathbin{\&} q)$ true, then it 
must make both $p$ and $q$ true. 
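
The following is a minimal sketch of the strong Kleene connectives and of a partial scene, offered only as an illustration. The encoding of "undefined" as `None`, the function names, and the sample scene are assumptions of the sketch rather than anything in Barwise and Perry's formalism. It checks the two properties just stated by enumerating all pairs of truth values.

```python
from itertools import product

T, F, U = True, False, None  # U marks "undefined"
VALUES = (T, F, U)


def k_not(a):
    """Strong Kleene negation: undefined stays undefined."""
    return U if a is U else (not a)


def k_or(a, b):
    """Strong Kleene disjunction: true if either disjunct is true."""
    if a is T or b is T:
        return T
    if a is F and b is F:
        return F
    return U


def k_and(a, b):
    """Strong Kleene conjunction: false if either conjunct is false."""
    if a is F or b is F:
        return F
    if a is T and b is T:
        return T
    return U


# A hypothetical scene: it settles that John smiles but is silent about Mary.
scene = {"john_smiles": T, "mary_dances": U}

# Whenever a disjunction comes out true, at least one disjunct is true.
disjunction_property = all(
    a is T or b is T for a, b in product(VALUES, repeat=2) if k_or(a, b) is T
)

# Whenever a conjunction comes out true, both conjuncts are true.
conjunction_property = all(
    a is T and b is T for a, b in product(VALUES, repeat=2) if k_and(a, b) is T
)

# The scene makes "John smiles or Mary dances" true ...
smile_or_dance = k_or(scene["john_smiles"], scene["mary_dances"])

# ... but leaves "Mary dances or Mary does not dance" undefined.
excluded_middle = k_or(scene["mary_dances"], k_not(scene["mary_dances"]))
```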

Barwise and Perry argue that the following principles of naked-infinitive 
reports are intuitively plausible (Barwise and Perry, 1983, pages 181-182). 


1. Principle of Veridicality: if $b$ sees $\phi$, then $\phi$. 

2. Principle of Substitutivity: if $b$ sees $\phi(t_1)$, and $t_1 = t_2$, then $b$ sees $\phi(t_2)$. 

3. Existential Generalization from Definite Descriptions: if $b$ sees $\phi(\text{the } \pi)$, 
then there is something$_1$ such that $b$ sees $\phi(\text{it}_1)$. 

4. Negation: if $b$ sees $\neg\phi$, then $b$ doesn’t see $\phi$. 

5. Conjunction Distribution: if $b$ sees $(\phi \mathbin{\&} \psi)$, then $b$ sees $\phi$ and $b$ sees $\psi$. 

6. Disjunction Distribution: if $b$ sees $(\phi \lor \psi)$, then $b$ sees $\phi$ or $b$ sees $\psi$. 

7. Distribution of Indefinite Descriptions: if $b$ sees $\phi(\text{a } \pi)$, then there is a $\pi_1$ 
such that $b$ sees $\phi(\text{it}_1)$. 


For example, Barwise and Perry suggest that if Ralph sees Ortcutt or Hort- 
cutt hide the letter, then either Ralph sees Ortcutt hide it, or Ralph sees Hort- 
cutt hide it. 


3.1.2 From Perception to Causation 


Naked-infinitive perceptual reports are merely a special case of a much wider 
phenomenon. It is the causal element that is crucial to the features of naked-infinitive 
perceptual reports that Barwise observed. When b sees a scene s, there 
is some sort of causal connection between s and the perceptual state of b. The 
distinctive logical properties of causation are reflected in the case of perception. 
We can see a parallel by considering the agentive use of the verb ‘to make’, 
for example: 


Mary made John smile. 


This use of to make, like the use of verbs of perception studied by Barwise, 
takes naked-infinitive clauses in the complement position. The same principles, 
such as veridicality, disjunction distribution and conjunction distribution, apply 
in this case as well. If Ralph makes Ortcutt or Hortcutt hide the letter, then 
either Ralph makes Ortcutt hide the letter, or Ralph makes Hortcutt hide the 
letter. 

The same logical phenomena can be seen in cases without the presence of 
naked-infinitive clauses. For example, consider the relation of causal relevance. 
We can express a relation of causal relevance between two states, events, or 
conditions by the use of gerundive phrases. For example: 


Mary’s dancing was relevant to John’s smiling. 


We can also nominalize the events, as in the expression fire: 


The fire was relevant to the water’s boiling. 


J. L. Mackie’s INUS account (an insufficient but necessary part of an unnecessary 
but sufficient condition) can be thought of as an attempt to formalize this 
relation of causal relevance (Mackie, 1965). If we think of causes and effects 
as parts of the world (i.e., as situation-tokens), then token s is an INUS cause 
of token s′ just in case s is an indispensable part of a token s″ that is sufficient 
to account for s′, that is, no part of s″ that does not include s is sufficient to 
produce s′. 
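
The mereological reading of the INUS condition can be sketched in a toy model, on the assumption that tokens are finite sets of atomic parts and that the part-of relation is set inclusion. The atoms, the list of minimal sufficient conditions, and the function names below are all hypothetical; in particular, sufficiency is simply stipulated rather than analyzed.

```python
from itertools import combinations


def parts(token):
    """All non-empty parts (sub-collections of atoms) of a token."""
    atoms = sorted(token)
    return [
        frozenset(combo)
        for size in range(1, len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


# Hypothetical minimal sufficient conditions for the water's boiling.
# The second entry makes the first one unnecessary, as INUS requires.
MINIMAL_SUFFICIENT = [
    frozenset({"fire", "pot_over_fire"}),
    frozenset({"kettle_switched_on"}),
]


def sufficient(token):
    """Stipulated sufficiency: the token includes a minimal sufficient set."""
    return any(minimal <= token for minimal in MINIMAL_SUFFICIENT)


def is_inus_part(s, whole):
    """True if s is an indispensable part of a sufficient token `whole`.

    Indispensable means: no part of `whole` that fails to include s
    is itself sufficient.
    """
    if not (s <= whole and sufficient(whole)):
        return False
    return not any(sufficient(x) for x in parts(whole) if not (s <= x))


scene = frozenset({"fire", "pot_over_fire", "eclipse"})

fire_is_inus = is_inus_part(frozenset({"fire"}), scene)
eclipse_is_inus = is_inus_part(frozenset({"eclipse"}), scene)
```

In this toy scene the fire counts as an INUS part, while the eclipse does not, because the part of the scene that omits the eclipse is still sufficient.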

The relation of causal relevance, whether taken intuitively or as refined in 
Mackie’s analysis, satisfies the semantic principles that Barwise and Perry dis- 
covered in the case of perception. 


(DD) $(A \lor B)$ is relevant to $C$ $\Rightarrow$ ($A$ is relevant to $C$) $\lor$ ($B$ is relevant to $C$) 
(CD) $(A \mathbin{\&} B)$ is relevant to $C$ $\Rightarrow$ ($A$ is relevant to $C$) $\mathbin{\&}$ ($B$ is relevant to $C$) 


According to the principle DD, causal relevance distributes over disjunction, 
and according to CD, it also distributes over conjunction. Principle CD is not 
plausible if we replace the relation of being relevant to with the relation of being 
a total cause of, but in fact we rarely make reference to such total causes in 
everyday life. If we mean by causal relevance something like being an essential 
part of a total cause, something that is necessary in the circumstances, then 
principle CD is clearly correct. 

Principle DD is a special case of the referential transparency of causal contexts. 
If the condition that $Fa$ is relevant to $p$, and $a = b$, then the condition 
that $Fb$ is relevant to $p$. Similarly, if $\exists x\, Fx$ is relevant to $p$, then there must exist 
an $a$ such that $Fa$ is relevant to $p$. If $p$ is true and $q$ is false, then $p \lor q$ is merely 
another way of referring to the condition that $p$, and so, if $p \lor q$ causes $r$, so 
does $p$ itself. 

However, if we combine these principles with the assumption that classically 
equivalent sentences are inter-substitutable in causal contexts, absurdity quickly 
results. 


1. The fire’s burning was relevant to the water’s boiling. (Premise) 


2. (The fire’s burning and the moon’s eclipsing the sun) or (the fire’s burning 
and the moon’s not eclipsing the sun) was relevant to the water’s boiling. 
(1, Substitution of classical equivalents) 


3. ((The fire’s burning and the moon’s eclipsing the sun) was relevant to the 
water’s boiling) or ((the fire’s burning and the moon’s not eclipsing the 
sun) was relevant to the water’s boiling). (2, DD) 


4. The moon’s eclipsing the sun was relevant to the water’s boiling, or the 
moon’s not eclipsing the sun was relevant to the water’s boiling. (3, CD, 
positive dilemma) 


Since there are so many different variant concepts of causation, there is some 
legitimate worry that the plausibility of CD and that of DD are due to the use 
of disparate versions of causation, a kind of fallacy of equivocation. However, I 
can construct a reductio that uses only CD and the principle of the substitution 
of classical equivalents. 


1. The fire’s burning was relevant to the water’s boiling. (Premise) 


2. (The fire’s burning and (the moon’s eclipsing the sun or the moon’s not 
eclipsing the sun)) was relevant to the water’s boiling. (1, substitution of 
classical equivalents) 


3. (The moon’s eclipsing the sun or the moon’s not eclipsing the sun) was 
relevant to the water’s boiling. (2, CD) 


Thus, any tautology would be causally relevant to any actual fact, surely an 
inappropriate result. 

The obvious semantic solution is to replace worlds with partial situations, 
employing strong Kleene (three-valued) evaluations. Strong Kleene equivalents 
are substitutable in causal contexts. This is the generalization of Barwise and 
Perry’s work (Barwise and Perry, 1983) on the semantics of perception reports. 
Perception includes a causal component, which explains the behavior of naked-infinitive 
perception reports. Consider the naked-infinitive report “Smith sees 
the fire burn and the moon eclipse the sun.” This report entails that both the 
fire’s burning and the moon’s eclipsing of the sun are causes (in Mackie’s INUS 
sense) of Smith’s visual experience (principle CD). Similarly, if Smith sees the 
fire burn or the moon eclipse the sun, then either Smith sees the fire burn, or he 
sees the moon eclipse the sun (principle DD). The explanation for why classically 
equivalent expressions are not inter-substitutable in naked-infinitive reports (as 
observed by Barwise and Perry) is that naked-infinitive reports entail a causal 
connection, and classically equivalent expressions are not substitutable in causal 
contexts generally. 
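
The difference between classical and strong Kleene equivalence can be checked mechanically. The sketch below assumes one natural reading of "strong Kleene equivalent", namely that two sentences receive the same value under every three-valued assignment; that reading, the tuple encoding of formulas, and the atom names are assumptions of the illustration. It tests the two substitutions used in step 2 of the reductio arguments above, together with an absorption law that does survive.

```python
from itertools import product

U = None  # "undefined"


def k_not(a):
    return U if a is U else (not a)


def k_or(a, b):
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return U


def k_and(a, b):
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return U


def ev(formula, valuation):
    """Evaluate an atom (str) or a tuple ('not', f), ('or', f, g), ('and', f, g)."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    vals = [ev(arg, valuation) for arg in args]
    if op == "not":
        return k_not(vals[0])
    if op == "or":
        return k_or(*vals)
    if op == "and":
        return k_and(*vals)
    raise ValueError(f"unknown connective: {op}")


def equivalent(f, g, atoms, values):
    """True if f and g agree under every assignment of `values` to `atoms`."""
    return all(
        ev(f, dict(zip(atoms, combo))) == ev(g, dict(zip(atoms, combo)))
        for combo in product(values, repeat=len(atoms))
    )


ATOMS = ("fire", "eclipse")
CLASSICAL = (True, False)
KLEENE = (True, False, U)

# Step 2 of the first reductio: (fire & eclipse) or (fire & not eclipse).
step2_dd = ("or", ("and", "fire", "eclipse"), ("and", "fire", ("not", "eclipse")))
# Step 2 of the second reductio: fire & (eclipse or not eclipse).
step2_cd = ("and", "fire", ("or", "eclipse", ("not", "eclipse")))
# Absorption: fire or (fire & eclipse).
absorbed = ("or", "fire", ("and", "fire", "eclipse"))

dd_classical = equivalent("fire", step2_dd, ATOMS, CLASSICAL)
dd_kleene = equivalent("fire", step2_dd, ATOMS, KLEENE)
cd_classical = equivalent("fire", step2_cd, ATOMS, CLASSICAL)
cd_kleene = equivalent("fire", step2_cd, ATOMS, KLEENE)
absorption_kleene = equivalent("fire", absorbed, ATOMS, KLEENE)
```

Both step-2 substitutions are classically valid but fail when the eclipse sentence is left undefined, which is exactly the case of a situation that is silent about the moon.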


3.1.3 The Frege-Church Slingshot 


Barwise and Perry made use of a second argument, one that had originally been 
used by Frege and by Church (1943), as well as by Quine (Quine, 1976, pp. 163-164) 
and Davidson (Davidson, 1984, page 19), to argue against the existence of 
things as fine-grained as facts or situations. Barwise and Perry call this 
argument the slingshot. The argument depends on the following transparency 
principle, which Barwise and Perry apply to the contexts of naked-infinitive 
perception reports: 


(SCDD) If The $F$ = The $G$, then $\phi[\text{The } F] \leftrightarrow \phi[\text{The } G]$. 


So, for example, if Jane is both the youngest spy and the secretary of the 
French club, then we can infer either of the following from the other: 


(1) John sees the youngest spy yawn. 
(2) John sees the secretary of the French club yawn. 


Of course, (1) does not imply that John sees or even knows that Jane is the 
youngest spy, and (2) does not imply that John sees that the person yawning 
was the secretary of the French club, but, since Jane is that secretary, by seeing 
Jane yawn, he did see the secretary yawn. 

Similarly, causal contexts are referentially transparent. If John made the 
youngest spy angry, and the youngest spy is the secretary of the French club, 
then John made the secretary of the French club angry. However, if we combine 
the principle (SCDD) (the substitution of co-referring definite descriptions) 
with the substitution of classical equivalents, we can use the slingshot argument 
to derive the absurd result that every fact causes every other fact. Suppose that 
the fire’s burning is causally relevant to the water’s boiling. The proposition 
the fire is burning is logically equivalent to the identity: 


$\{\emptyset\} = \{x : x = \emptyset \mathbin{\&} \text{The fire is burning}\}$ 


By the substitution of classical equivalents, this identity’s holding is also 
causally relevant to the water’s boiling. Suppose that the moon is eclipsing the 
sun. Then the following identity is true: 


$\{x : x = \emptyset \mathbin{\&} \text{The fire is burning}\} = \{x : x = \emptyset \mathbin{\&} \text{The moon is eclipsing the sun}\}$ 


By (SCDD), since these specifications of the set $\{\emptyset\}$ are definite descriptions 
of a sort, we have that the following identity’s holding is also causally relevant 
to the water’s boiling: 


$\{\emptyset\} = \{x : x = \emptyset \mathbin{\&} \text{The moon is eclipsing the sun}\}$ 


Finally, by the substitution of classical equivalents, we reach the absurd 
conclusion that the moon’s eclipsing of the sun is causally relevant to the water’s 
boiling. Since the result is absurd, and the transparency principle (SCDD) seems 
to hold of causal contexts, we have an independent argument for rejecting the 
substitution of classical equivalents. 

Mellor has recently argued that causal contexts are not referentially transpar- 
ent. For example, suppose Don falls during a rock-climbing expedition because 
his rope broke. Suppose that Don’s fall was the first of the expedition, and 
suppose that his rope was the weakest. Mellor argues (Mellor, 1995, p. 115) 
that we can accept (3) without accepting (4) or (5): 


(3) Don’s fall is the first because his rope was the weakest. 
(4) Don’s fall is Don’s fall because his rope was the weakest. 
(5) Don’s fall is the first because the weakest rope was the weakest rope. 


In these cases, the definite descriptions the first fall and the weakest rope 
are doing more than simply picking out a particular fall or a particular rope. 
They are also making reference to causally relevant features of the situations. 
We have moved from the assertion of causal relevance between two tokens to 
the causal explanation at the level of types. If we consider instead (6) and (7): 


(6) The weakness of Don’s rope was causally relevant to his fall. 
(7) The weakness of the weakest rope was causally relevant to the first fall. 


we can see that substitution of co-referring definite descriptions in these contexts 
is wholly unproblematic. More generally, suppose that c is the one and only K, 
and e the one and only L. In the statement 


Kc is causally relevant to Le 


we can freely substitute the K for c and the L for e: 


The K’s being K is causally relevant to the L’s being L. 


In this case, as Mellor notes (Mellor, 1995, p. 152), the definite descriptions 
are being used as rigid designators of c and e, and so cause and effect are both 
merely contingent, despite their apriority. 


3.1.4 The Transitivity of Causation 


It is a commonplace that causation is transitive: if A is a cause of B, and B is 
a cause of C, then A is a cause of C. This transitivity, however, is a difficult 
thing to account for naturally. Positive statistical correlation is not transitive: 
it is quite possible for A to be positively correlated with B, B to be positively 
correlated with C, yet A to be independent of, or even negatively correlated with 
C. Similarly, counterfactual dependence is not transitive: from the fact that B 
wouldn’t have happened in the absence of A, and C wouldn’t have happened 
in the absence of B, it does not follow that C wouldn’t have happened in the 
absence of A. 
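
A small exact calculation illustrates the failure of transitivity for positive correlation. The setup is hypothetical: A and C are independent fair coin flips (coded 0 or 1), and B is their sum. Covariance is used here as the measure of correlation, which is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a triple (a, b, c) with b = a + c; all four are equally likely.
outcomes = [(a, a + c, c) for a, c in product((0, 1), repeat=2)]
weight = Fraction(1, len(outcomes))


def mean(i):
    """Expected value of the i-th coordinate."""
    return sum(weight * outcome[i] for outcome in outcomes)


def cov(i, j):
    """Covariance of the i-th and j-th coordinates."""
    joint_mean = sum(weight * outcome[i] * outcome[j] for outcome in outcomes)
    return joint_mean - mean(i) * mean(j)


cov_ab = cov(0, 1)  # A with B: positive
cov_bc = cov(1, 2)  # B with C: positive
cov_ac = cov(0, 2)  # A with C: zero, since A and C are independent
```

Here A is positively correlated with B, and B with C, yet A and C are uncorrelated by construction.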

Of course, it is always possible to take the transitive closure of a non-transitive 
relation (as David Lewis does in the case of counterfactual dependence), but 
by itself this is not enough to give an illuminating account of causal transitivity. 
What is needed in addition is a demonstration that the characteristic modal or 
statistical properties of the immediate causal connections are inherited by the 
connections established by transitive closure. This transference of characteristic 
properties from immediate to mediate connections fails in both the probabilistic 
and counterfactual analyses. 

Some have argued recently that causation is not transitive. For example, 
Michael McDermott (1995) proposes the following counterexample. A and B 
each control a switch: if both switches end up in the same position (left or right), 
person C receives a shock. Person A must set his switch first, in full view of 
person B, who wishes to deliver a shock to C. If A puts his switch right, this 
causes B to switch right also, which causes a shock to C. Similarly, if A sets 
his switch to the left, the result is that C is shocked. McDermott argues that 
A’s setting of his switch is not a cause of the shock, even though it is a cause 
of a cause. Simpler examples also exist: one’s birth is a cause of the various 
episodes of one’s life, at least one of which is a cause of one’s death. Hence, it 
seems that, if transitivity holds, one’s birth is a cause of one’s death. 
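
The structure of the switch example can be laid out in a few lines. The function names and the encoding of B's policy are assumptions of this sketch; it models only the dependencies among the three choices and takes no stand on whether A's act is a cause of the shock.

```python
POSITIONS = ("left", "right")


def b_response(a_position):
    """B wants C shocked, so B matches whatever A has visibly done."""
    return a_position


def shocked(a_position, b_position):
    """C is shocked exactly when both switches end up in the same position."""
    return a_position == b_position


# Whatever A does, C ends up shocked.
shock_given_a = {a: shocked(a, b_response(a)) for a in POSITIONS}

# B's setting nevertheless depends on A's setting.
b_depends_on_a = b_response("left") != b_response("right")

# Holding A's switch fixed at right, the shock depends on B's setting.
shock_depends_on_b = shocked("right", "right") != shocked("right", "left")
```

Each link in the chain shows a dependence, while the outcome is the same under either of A's options, which is what makes the case a test for transitivity.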

Although I admit that it sounds odd in these cases to insist that A’s setting of 
the switch is a cause of the shock, and that one’s birth is a cause of one’s death, 
these do not seem to me to be convincing counterexamples to the transitivity 
of causation. If a doctor causes a child to be born, and that birth causes a 
particular death from a genetic disorder, the doctor’s action can correctly be 
ascribed as one of the causes of the death, that is, of the particular death that 
resulted. 


3.2 Situation Mereology and Causation 


I will take facts or situations to be the relata of causation. A situation is a 
real, concrete part of the world, one that makes certain propositions true and 
other propositions false. There is one maximally large fact, which we can call the 
“world.” For each proposition p, p is made either true or false by the world. 
Smaller situations, proper parts of the world, make certain propositions true, 
others false, and leave still others undefined in their truth values. Hence, in rea- 
soning about facts, we must make use of a three-valued logic. The appropriate 
logic to use in this case seems to be the strong Kleene truth tables, in which, 
for instance, a disjunction is true if either of its disjuncts is true, false if they 
are both false, and undefined otherwise. 

Since facts are concrete parts of the world, it makes sense to apply mereology, 
the calculus of parts and wholes. Some standard symbols of mereology are
$\circ$ (for overlap) and $\sqsubseteq$ for the weak part-of relation (i.e., $a \sqsubseteq b$ iff either a
is a proper part of b or a is identical to b). There are three standard axioms of
mereology:


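Axiom 3.1 $p \sqsubseteq q \leftrightarrow \forall r\,(r \circ p \rightarrow r \circ q)$
Axiom 3.2 $\exists p\,\phi(p) \rightarrow \exists q\,\forall r\,(r \circ q \leftrightarrow \exists u\,(\phi(u) \,\&\, u \circ r))$
Axiom 3.3 $p = q \leftrightarrow (p \sqsubseteq q \,\&\, q \sqsubseteq p)$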


Axiom 3.1 defines the part-of relation in terms of overlap, and axiom 3.2 is 
an aggregation or fusion principle: if there are any facts of type $\phi$, then there
is an aggregate or sum of all the $\phi$ facts. The mereological sum of facts of type
$\phi$ is symbolized as “$\Sigma\,\phi(x)$.” Axiom 3.3 guarantees that the part-of relation is
reflexive and anti-symmetric. 
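
The three axioms can be checked mechanically on a small finite model. The Python sketch below is an illustration only. It assumes, as a modelling convenience that is not part of the theory, that each fact is a nonempty set of hypothetical atoms (`ATOMS`), that part-of is set inclusion, and that overlap is nonempty intersection. The function names are likewise hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q", "r")  # hypothetical atomic facts, chosen only for illustration

# Assumption: every nonempty set of atoms counts as one situation (fact).
SITUATIONS = [frozenset(c) for n in range(1, len(ATOMS) + 1)
              for c in combinations(ATOMS, n)]

def overlaps(a, b):
    """a and b overlap iff they share at least one atom."""
    return bool(a & b)

def part_of(a, b):
    """Weak part-of: a is a proper part of b or identical to b."""
    return a <= b

def fusion(facts):
    """Mereological sum of a nonempty collection of facts."""
    facts = list(facts)
    if not facts:
        raise ValueError("the fusion axiom only covers nonempty types")
    return frozenset().union(*facts)

def axiom_3_1():
    """p is part of q iff everything overlapping p also overlaps q."""
    return all(
        part_of(p, q) == all(overlaps(r, q) for r in SITUATIONS if overlaps(r, p))
        for p in SITUATIONS for q in SITUATIONS)

def axiom_3_2(phi):
    """If some fact satisfies phi, the sum q overlaps exactly what some phi-fact overlaps."""
    members = [u for u in SITUATIONS if phi(u)]
    if not members:
        return True  # the axiom is conditional on there being a phi-fact
    q = fusion(members)
    return all(overlaps(r, q) == any(overlaps(u, r) for u in members)
               for r in SITUATIONS)

def axiom_3_3():
    """Identity coincides with mutual parthood."""
    return all((p == q) == (part_of(p, q) and part_of(q, p))
               for p in SITUATIONS for q in SITUATIONS)
```

For example, `axiom_3_2(lambda u: "p" in u)` checks the fusion principle for the type "contains the atom p". A set-based model of this kind is only one way of satisfying the axioms, and nothing in the text requires facts to be built from atoms.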

I will represent the causal relation by means of the symbol $\rhd$. 


Axiom 3.4 (Irreflexivity of Causation) $a \rhd b \rightarrow \neg(b \sqsubseteq a)$


I will also assume that causation is closed under part-inclusion with respect 
to the effect, that is: 


Axiom 3.5 (Right Closure under Part) $(a \rhd b \,\&\, c \sqsubseteq b) \rightarrow a \rhd c$


The axioms of Irreflexivity and Right Closure immediately imply the separation
of a cause from its effects: if a causes b, then a and b cannot overlap. 
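
The separation claim can be spelled out as a short reductio. The derivation below is a sketch that assumes the usual reading of overlap, on which two facts overlap just in case they have a common part c. That reading is an added assumption here, since the axioms above treat overlap as the primitive notion.

```latex
\begin{align*}
&1.\ a \rhd b && \text{assumption}\\
&2.\ c \sqsubseteq a \;\&\; c \sqsubseteq b && \text{assumption for reductio: $a$ and $b$ share the part $c$}\\
&3.\ a \rhd c && \text{from 1 and 2, by Right Closure under Part}\\
&4.\ \neg(c \sqsubseteq a) && \text{from 3, by Irreflexivity}\\
&5.\ \bot && \text{from 2 and 4}
\end{align*}
```

Since the assumption of a common part leads to contradiction, a cause and its effect share no part.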

A notion that will prove to be useful is that of a coincident token. I will
use the symbol $\prec$ to represent a primitive relation of causal priority. Causal
priority is a necessary, but perhaps not a sufficient condition, of the causal 
relation proper. A coincident token is one none of whose parts is causally prior 
to any other part. This means that all of the parts of a coincident token are 
in some sense simultaneous (in relativity theory, at a spacelike, rather than 
timelike, separation from one another). 


Definition 3.1 $Co(a) \leftrightarrow_{df} \forall x \forall y\,((x \sqsubseteq a \,\&\, y \sqsubseteq a) \rightarrow \neg(x \prec y))$


Since it is impossible for a to cause b without a’s being causally prior to b, 
we have as an immediate corollary that if a situation is coincident, then no part 
of it causes another of its parts. 


Corollary 3.1 $Co(a) \rightarrow \forall x \forall y\,((x \sqsubseteq a \,\&\, y \sqsubseteq a) \rightarrow \neg(x \rhd y))$
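
The definition of a coincident token can be tried out on a toy example. The following Python sketch is illustrative only. It assumes that situations are nonempty sets of hypothetical atoms and that causal priority between situations is lifted from a hand-written table of priority facts between atoms (`PRIOR_ATOMS`). Neither assumption comes from the theory itself, which takes priority between tokens as primitive.

```python
from itertools import combinations

# Hypothetical priority facts between atomic situations (illustration only).
PRIOR_ATOMS = {("spark", "fire"), ("fuel", "fire")}

def parts(a):
    """All nonempty parts of situation a (a frozenset of atoms)."""
    items = sorted(a)
    return [frozenset(c) for n in range(1, len(items) + 1)
            for c in combinations(items, n)]

def prior(x, y):
    """Assumed lifting: x is causally prior to y iff every atom of x
    precedes every atom of y."""
    return all((u, v) in PRIOR_ATOMS for u in x for v in y)

def coincident(a):
    """Definition 3.1: no part of a is causally prior to any part of a."""
    ps = parts(a)
    return not any(prior(x, y) for x in ps for y in ps)
```

On this toy table, the situation made of the spark and the fuel is coincident, while the situation made of the spark and the fire is not, because one of its parts is prior to another.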


We will at least consider the following hypothesis of causality: that if a 
causes b, and b can be extended to a coincident fact c, then a can be extended 
to a cause of c: 


Hypothesis 3.1 (Causality) $(a \rhd b \,\&\, b \sqsubseteq c \,\&\, Co(a) \,\&\, Co(c)) \rightarrow \exists d\,(a \sqsubseteq d \,\&\, Co(d) \,\&\, d \rhd c)$


I will also assume that causation is transitive: 


Axiom 3.6 (Transitivity) $(a \rhd b \,\&\, b \rhd c) \rightarrow a \rhd c$


In subsequent chapters, I will define the causal relation in terms of a notion of 
causal priority, and I will demonstrate there that the causal relation is transitive. 
However, for the moment, we will take the transitivity of total causation as a 
given. 

In the philosophical literature, two kinds of causes are distinguished: total 
causes, and essential parts of total causes. The latter were called INUS condi- 
tions by the British philosopher J. L. Mackie, where INUS stands for “insufficient 
but necessary part of an unnecessary but sufficient condition.” Unfortunately, 
Mackie’s account was subject to insuperable difficulties, because he did not make 
use of a mereological theory of facts as the relata of causation. We will under- 
stand the condition “necessary part” to mean that a fact is part of a minimal 
cause of the effect. In other words, I will take $a \rhd b$ to mean that a is a total,
sufficient cause of fact b. Fact a is an INUS cause of b (symbolically, $a \leadsto b$) iff
there is some total cause c of b such that: (i) a is a part of c, and (ii) no proper 
part of c is a total cause of b. 


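Definition 3.2 (INUS Cause) $a \leadsto b \leftrightarrow_{df} \exists c\,(a \sqsubseteq c \,\&\, c \rhd b \,\&\, \forall d\,((d \rhd b \,\&\, d \sqsubseteq c) \rightarrow d = c))$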


In working with INUS conditions, it is useful to define the relation of being 
a minimal total cause of another event-token. 


Definition 3.3 (Minimal Total Cause) $a \rhd_{\min} b \leftrightarrow_{df} \forall c\,((c \sqsubseteq a \,\&\, c \rhd b) \leftrightarrow c = a)$


One token is an INUS cause of another just in case it is part of some minimal 
total cause of the second. 
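
The relation between total causes, minimal total causes, and INUS causes can be made concrete on a finite example. The Python sketch below is a simplification offered for illustration. It assumes that situations are nonempty sets of hypothetical atoms, that the effect is named by a string, and that a situation is a total cause whenever it contains one of the hand-listed sufficient sets in `SUFFICIENT`. That last assumption goes beyond the axioms given so far and is made only to keep the example small.

```python
from itertools import combinations

CANDIDATE_ATOMS = ("spark", "fuel", "oxygen", "lightning", "bystander")

# Hypothetical sufficient sets for the effect "fire" (illustration only).
SUFFICIENT = {"fire": [frozenset({"spark", "fuel", "oxygen"}),
                       frozenset({"lightning", "fuel", "oxygen"})]}

def situations():
    """Every nonempty set of candidate atoms counts as a situation."""
    return [frozenset(c) for n in range(1, len(CANDIDATE_ATOMS) + 1)
            for c in combinations(CANDIDATE_ATOMS, n)]

def total_cause(c, effect):
    """Assumed reading: c is a total cause iff it contains a sufficient set."""
    return any(s <= c for s in SUFFICIENT.get(effect, []))

def minimal_total_cause(a, effect):
    """Definition 3.3: among the parts of a, exactly a itself is a total cause."""
    return all((c <= a and total_cause(c, effect)) == (c == a)
               for c in situations())

def inus_cause(a, effect):
    """Definition 3.2, via the remark above: a is part of some minimal total cause."""
    return any(a <= c and minimal_total_cause(c, effect) for c in situations())
```

Here the spark alone is an INUS cause of the fire, since it belongs to the minimal total cause made of spark, fuel, and oxygen. The bystander is not an INUS cause, even though adding the bystander to a total cause still yields a total cause on the assumed reading.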


3.3 A Situation-Theoretic Logic of Causation 


In the previous section, I sketched out a theory of causation, treating causation 
as a relation between facts or situations. In this section, I want to build on that 
theory to create a logic of causation, treating causation now as a connective 
that combines two propositions or formulas. In order to do so, we must take 
each true formula of a formal language as standing for or representing some 
definite fact. I will take an expression ‘$\phi^*$’ to stand for the sum of all the
minimal verifiers of the formula $\phi$. A minimal verifier of a formula is a fact that
makes the formula true but has no proper parts that make the formula true. 
For example, a minimal verifier of a disjunction will typically make one or the 
other, but not both, disjuncts true. If both disjuncts are true, the disjunction 
stands for the sum of the verifiers of the two disjuncts. If only one disjunct is 
true, the disjunction stands for the verifier of that disjunct. 

Formally, I define the factual correlate of a formula to be the mereological 
sum (represented by the $\Sigma$ notation) of all the facts that minimally verify that
formula.


Definition 3.4 $\|\phi^*\| =_{df} \Sigma\,\forall y \sqsubseteq x\,(y \models \phi \leftrightarrow x = y)$


Given this definition, we can stipulate the truth-conditions of causal formu- 
las involving formulas in terms of the underlying relations between the factual 
correlates of those formulas: 


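1. $(\phi \rhd \psi) \leftrightarrow \|\phi^*\| \rhd \|\psi^*\|$
2. $(\phi \leadsto \psi) \leftrightarrow \|\phi^*\| \leadsto \|\psi^*\|$
3. $(\phi \sqsubseteq \psi) \leftrightarrow \|\phi^*\| \sqsubseteq \|\psi^*\|$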


We are now in a position to resolve a puzzle: the failure of the substitution 
of classical equivalents within causal contexts. Two formulas can be counted on
to stand for the same fact only if they are strong-Kleene equivalent. Strong- 
Kleene equivalence is a much stronger condition than classical equivalence. For 
example, $(\phi \vee \psi)$ is strong-Kleene equivalent to $(\psi \vee \phi)$, since they are verified by
exactly the same facts. However, $\phi$ is not strong-Kleene equivalent to
$((\phi \,\&\, \psi) \vee (\phi \,\&\, \neg\psi))$, despite the fact that these are classically equivalent.

Here again are the strong Kleene truth tables for negation, disjunction, and 
conjunction. 
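
The strong Kleene connectives are standard, and the tables can be generated rather than written out. The Python sketch below is an illustration: the value names `T`, `F`, `U` and the helper `equivalent` are hypothetical. It treats two formulas as strong-Kleene equivalent when they receive the same value under every three-valued assignment, which is a truth-table stand-in for "verified by exactly the same facts."

```python
from itertools import product

T, F, U = "T", "F", "U"  # true, false, undefined
VALUES = (T, F, U)

def k_not(p):
    """Negation swaps true and false and leaves undefined alone."""
    return {T: F, F: T, U: U}[p]

def k_or(p, q):
    """True if either disjunct is true, false if both are false, else undefined."""
    if p == T or q == T:
        return T
    if p == F and q == F:
        return F
    return U

def k_and(p, q):
    """False if either conjunct is false, true if both are true, else undefined."""
    if p == F or q == F:
        return F
    if p == T and q == T:
        return T
    return U

def equivalent(f, g):
    """Do two binary formulas agree on every three-valued assignment?"""
    return all(f(p, q) == g(p, q) for p, q in product(VALUES, repeat=2))

if __name__ == "__main__":
    print("not:", {p: k_not(p) for p in VALUES})
    for name, op in (("or", k_or), ("and", k_and)):
        print(name)
        for p in VALUES:
            print("  ", p, [op(p, q) for q in VALUES])
```

With these definitions, `equivalent(k_or, lambda p, q: k_or(q, p))` holds, while comparing `lambda p, q: p` with `lambda p, q: k_or(k_and(p, q), k_and(p, k_not(q)))` fails: when the first formula is true and the second undefined, the complex disjunction is undefined.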


The following logical principles are confirmed by the present understanding 
of the meanings of the causal connectives: 


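1. $((\phi \vee \psi) \sqsubseteq \chi) \rightarrow ((\phi \sqsubseteq \chi) \vee (\psi \sqsubseteq \chi))$
2. $((\phi \,\&\, \psi) \sqsubseteq \chi) \rightarrow ((\phi \sqsubseteq \chi) \,\&\, (\psi \sqsubseteq \chi))$
3. $((\phi \vee \psi) \rhd \chi) \rightarrow ((\phi \leadsto \chi) \vee (\psi \leadsto \chi))$
4. $((\phi \,\&\, \psi) \leadsto \chi) \rightarrow ((\phi \leadsto \chi) \,\&\, (\psi \leadsto \chi))$
5. $((\phi \rhd \chi) \,\&\, (\phi \rhd \psi)) \leftrightarrow (\phi \rhd (\chi \,\&\, \psi))$
6. $((\phi \rhd \psi) \,\&\, (\chi \rhd \zeta) \,\&\, Co(\phi \,\&\, \chi)) \rightarrow ((\phi \,\&\, \chi) \rhd (\psi \,\&\, \zeta))$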


The expression $Co(\phi)$ has as its truth-conditions $\forall x \sqsubseteq \|\phi^*\|\ \forall y \sqsubseteq \|\phi^*\|\ \neg(x \prec y)$.


3.3.1 Failures of Substitution 


Substitution of classical equivalents clearly fails in this framework. To return to 
the example, in which it appeared that an eclipse caused the water to boil, we 
can see that the reductio fails at step 2: from the fact that fire caused the water 
to boil, we cannot conclude that the complex condition consisting of fire and 
eclipse or fire and no eclipse did so. Situations exist that verify the occurrence of 
fire without verifying this disjunction: namely, situations containing information 
about the fire but no information about the occurrence or non-occurrence of the 
eclipse. 


3.4 The Transitivity of INUS Causation 


Is INUS causation a transitive relation? Suppose $a \leadsto b$, and $b \leadsto c$. By
Causality and the Transitivity of $\rhd$, it follows that a is part of a total cause of c. 
However, it does not follow that a is part of a minimal total cause of c. Suppose, 
for example, that c was overdetermined, that there were two independent causes 
of c. It could be that a is then partly redundant as a cause of c, despite its 
being non-redundant as part of a cause of b. In order to make INUS causation 
transitive, we would have to add something like the following two theses: 


Hypothesis 3.2 (No Overdetermination) $(a \rhd b \,\&\, c \rhd b \,\&\, Co(a \sqcup c)) \rightarrow (a \sqcap c) \rhd b$


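Hypothesis 3.3 (No Action at a Distance) $(a \rhd b \,\&\, b \rhd c \,\&\, d \sqsubseteq a \,\&\, e \sqsubseteq c \,\&\, d \rhd e) \rightarrow \exists x\,(Co(x \sqcup b) \,\&\, d \rhd x \,\&\, x \rhd e)$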


No Overdetermination stipulates that if a and c are both total causes of b,
and the sum of a and c is coincident, then the mereological intersection of a and c
exists and is also a total cause of b. No Action at a Distance requires that if a causes
b and b causes c, d is a part of a, e is a part of c, and d is a cause of e, then
there must exist a causal chain leading from d to e that co-exists with b. 

It seems reasonable to assume that if one situation is a minimal cause of 
another, then the first must be coincident. If it were not coincident, then it 
would have some redundant parts, namely, those that are caused by other of its 
parts. 


Hypothesis 3.4 (Coincidence of Minimal Causes) $a \rhd_{\min} b \rightarrow Co(a)$


These hypotheses entail that INUS causation is transitive. 


Theorem 3.1 (Transitivity of INUS, I) Causality, Transitivity, No 
Overdetermination, No Action at a Distance, and Coincidence of Minimal 
Causes entail that INUS causation is transitive. 


Proof: Suppose $a \leadsto b$ and $b \leadsto c$. We must show that $a \leadsto c$. By definition
of $\leadsto$, we have that $a \sqsubseteq d$, $d \rhd_{\min} b$, $b \sqsubseteq e$, and $e \rhd_{\min} c$. By the Coincidence
of Minimal Causes, we know that d and e are each coincident. By Causality, it
follows that there exists an f such that $d \sqsubseteq f$, $Co(f)$, and $f \rhd e$. Since $f \rhd e$
and $e \rhd c$, it follows by the transitivity of $\rhd$ that $f \rhd c$. It suffices to show that,
for every $g \sqsubseteq f$, if $g \rhd c$, then $a \sqsubseteq g$.

Suppose that $g \sqsubseteq f$ and $g \rhd c$. It suffices to prove that $a \sqsubseteq g$. Since we have
$f \rhd e$, $e \rhd c$, $g \sqsubseteq f$, $c \sqsubseteq c$, and $g \rhd c$, it follows from No Action at a Distance
that there is an h such that $Co(h \sqcup e)$, $g \rhd h$, and $h \rhd c$. Since $h \rhd c$ and $e \rhd c$,
by No Overdetermination, we have that $(h \sqcap e) \rhd c$. Since e is a minimal cause
of c, it must be that $e \sqsubseteq (h \sqcap e)$, which means that $e \sqsubseteq h$. By Right Closure,
it follows that $g \rhd e$. By Right Closure again, since $b \sqsubseteq e$, it follows that $g \rhd b$.
Since f is coincident, and d and g are both parts of f, it must be that $d \sqcup g$ is
also coincident.

Since $g \rhd b$, $d \rhd b$, and $Co(d \sqcup g)$, it follows by No Overdetermination that
$(d \sqcap g) \rhd b$. Since d is a minimal cause of b, it must be that $d \sqsubseteq (d \sqcap g)$, which
means that $d \sqsubseteq g$. Since $a \sqsubseteq d$, we can conclude that $a \sqsubseteq g$. QED

The No Action at a Distance axiom seems reasonable, since we do have a 
strong predilection toward believing in the existence of an intervening chain of 
events linking any cause and effect widely separated in time. However, the No 
Overdetermination axiom seems too strong: although unlikely, overdetermining 
causes are not altogether impossible. It seems reasonable to weaken the No 
Overdetermination axiom into a defeasible or default rule, with the consequence 
that the transitivity of INUS causation is to be expected as a rule, but not 
without exceptions. 

There is another way of securing the transitivity of INUS causation. Instead 
of the No Overdetermination condition, we could impose a condition of strict 
downward monotonicity of causation: if a is a total (not INUS) cause of b, and 
c is a proper part of b, then there is always a proper part of a that is a total 
cause of c. This implies the absence of granularity in the causal structure of 
the world. It also implies that if a is a minimal cause of b, then b is a maximal 
effect of a. 


Hypothesis 3.5 (Strict Downward Monotonicity) $(a \rhd b) \,\&\, (c \sqsubset b) \rightarrow \exists d\,((d \sqsubset a) \,\&\, (d \rhd c))$


In order to secure transitivity of INUS, we need a generalized version of the 
hypothesis of Causality: 


Hypothesis 3.6 (Generalized Causality) $(a \rhd_{\min} b) \,\&\, (b \sqsubseteq c) \,\&\, Co(c) \rightarrow \exists d\,(d \rhd_{\min} c \,\&\, a \sqsubseteq d)$


In addition, we need to assume that any coincident extension of a cause is 
still a cause, and that the sum of two effects is also an effect. 


Axiom 3.7 (Extensibility of Causes) $(a \rhd b) \,\&\, (a \sqsubseteq c) \,\&\, Co(c) \rightarrow (c \rhd b)$


Axiom 3.8 (Right Closure under Sum) $(a \rhd b) \,\&\, (a \rhd c) \rightarrow (a \rhd (b \sqcup c))$


Theorem 3.2 (Transitivity of INUS, II) Strict Downward Monotonicity, Generalized
Causality, Transitivity, and No Action at a Distance entail that INUS
causation is transitive. 


Proof: Assume that $a \leadsto b$ and $b \leadsto c$. We must show that $a \leadsto c$. By
definition of $\leadsto$, we have that $a \sqsubseteq d$, $d \rhd_{\min} b$, $b \sqsubseteq e$, and $e \rhd_{\min} c$. By
Generalized Causality, it follows that there exists an f such that $d \sqsubseteq f$, $Co(f)$,
and $f \rhd_{\min} e$. Since $f \rhd e$ and $e \rhd c$, it follows by the transitivity of $\rhd$ that $f \rhd c$.
It suffices to show that for every $g \sqsubseteq f$, if $g \rhd c$, then $a \sqsubseteq g$.

Suppose that $g \sqsubseteq f$ and $g \rhd c$. By No Action at a Distance, it follows that
there is an h such that $g \rhd h$, $h \rhd c$, and $Co(h \sqcup e)$. By Extensibility of Causes,
since $g \rhd h$, $g \sqsubseteq f$, and $Co(f)$, it follows that $f \rhd h$. Since $f \rhd h$ and $f \rhd e$, by
Right Closure under Sum, we have $f \rhd (h \sqcup e)$.

Suppose for contradiction that $h \not\sqsubseteq e$. Then $e \sqsubset (h \sqcup e)$. By Strict Downward
Monotonicity, it follows that there is a $j \sqsubset f$ such that $j \rhd e$. But this contradicts
the fact that f is a minimal cause of e. Thus, $h \sqsubseteq e$.

Since $h \rhd c$, $e \rhd_{\min} c$, and $h \sqsubseteq e$, it follows that $h = e$. We know that $g \rhd h$,
so $g \rhd e$. Since $g \sqsubseteq f$, and f is a minimal cause of e, it follows that $g = f$. Since
$a \sqsubseteq d$, $d \sqsubseteq f$, and $f = g$, we have that $a \sqsubseteq g$. QED


4 


A Deterministic Model 


In this chapter, I will employ the logic of situations (including the partial modal 
logic laid out in appendix A) in developing definitions for a family of causal no- 
tions. The atomic formulas of the language, or — to use the material rather than 
the formal mode — the basic situation types, consist of types of the following 
forms: 


• $\phi, \psi, \chi, \ldots$ (basic atomic types)

• $\Box\phi$, $\Diamond\phi$ (modal types)

• $s \sqsubseteq s'$, $s = s'$ (mereological types)

• $s \models \phi$ (classificatory types)

• $As$ (actuality types)

• $s \prec s'$ (causal priority types)


All of these types, with the exception of the last one, are given precise definitions
in appendix A. In this chapter, I will treat the causal priority relation $\prec$
as a primitive, corresponding to a classical (bivalent) and mereologically persis- 
tent binary relation on the class of situation-tokens. In the next chapter, once 
I have moved to an indeterministic model of causation, I will be able to offer a 
definition of causal priority in terms of modality and mereology. 

As explained in appendix A, the class of situation types is closed under the 
logical operations corresponding to the usual connectives of classical predicate 
logic (negation, conjunction, disjunction, and existential and universal predica- 
tion). 


4.1 Desiderata 


An adequate theory of causation would provide a framework for evaluating philo- 
sophical arguments involving causal notions, and for checking the consistency 
and unanticipated consequences of claims made about causation in the course 
of theorizing about other subjects. An exact theory of causation would also be 
of value to linguists and researchers in artificial intelligence, in providing a for- 
mal language adequate to the task of representing causal information carried in 
natural language or implicit in the design specifications for an intelligent agent. 

However, before looking at such specific applications of causal theory, it is 
possible to identify a number of at least prima facie desirable features for the 
prospective theory, based on existing philosophical insight into the nature of 
causation. There are five such desiderata of special significance: 


1. Veridicality, Asymmetry, and Transitivity. Causation is veridical 
(both causes and effects are actual), asymmetric, and transitive, and, in 
an optimal theory, these properties would be natural consequences of some 
more fundamental properties, and not generated in an ad hoc fashion, by, 
for example, taking the transitive closure of some non-transitive relation. 


2. Constructibility of Spacetime. This is a more controversial desidera- 
tum, but a number of philosophers have attempted to give causal theories 
of time. This would effect considerable simplicity in our account of the 
world, so an ideal theory of causation would make no use of spatial or 
temporal concepts, leaving open the possibility of defining such concepts 
in causal terms. 


3. Modal Facts as Causes. This is also a more controversial desideratum 
than the first one. I want to give a causal account of our knowledge 
of modal truths (truths about possibility and necessity), and I hope to 
extend this to a causal account of logical and mathematical knowledge. 
Consequently, I want a theory of causation that leaves open the possibility 
that “eternal” facts, like modal facts, could enter into the causal nexus. 


4. Formalizability of Teleological Explanations. The final cause or explanation
seems to be causally posterior to what it explains. The mystery 
is: how can the order of explanation reverse the order of efficient cau- 
sation? There would seem to be some relationship between teleological 
and causal explanation, and it is to be hoped that a formalism for causal 
reasoning will clarify both the nature of teleological explanation and its 
relation to causation. 


5. Compatibility with Indeterminism. In all likelihood, the actual world 
is not deterministic, but that does not rule out causality. An adequate 
theory of causation should be robust enough to encompass the possibility 
of indeterminism. 


In this chapter, I will lay out a formal theory of causation that clearly satisfies 
the first four desiderata. It will, however, presuppose a deterministic concep- 
tion of causation (according to which causes strictly necessitate their effects). 
In the following two chapters, I will show how to modify this deterministic ac- 
count in order to make it compatible with indeterministic, probabilistic forms 
of causation. 


4.2 Causation and Determinism 


4.2.1 Token Determinism and Type Determinism 


The thesis of determinism has two quite distinct versions, one applying to 
situation-tokens and the other to situation-types. Token determinism would 
be the view that for every token outside of a special class of uncaused tokens, 
there is a causally prior, strictly sufficient condition for the existence of that 
token. In other words, for every caused token s, there exists a causally prior 
token s’ such that the existence of s’ strictly necessitates the existence of s. 

Type determinism is the view that for every caused token s and every type 
$\phi$ such that s belongs to $\phi$, there is a situation s’ that is immediately prior to s
and a type $\psi$ such that s’ is of type $\psi$, and the existence of a situation of type $\psi$
strictly necessitates the existence of an immediately posterior situation of type
$\phi$.

It is natural for a determinist to identify token causation with necessitation 
by a causally prior token, and to identify causal explanation with the necessi- 
tation of a type by the type of a causally prior token. And that is exactly what 
I will do in this chapter. However, although these definitions of causation and 
causal explanation are quite simple and have a number of desirable features, 
they possess one very serious defect. If we define causation as a species of strict 
necessitation, we are forced to the conclusion that a world without strict neces- 
sitation is a world without causation. We make causal idioms inapplicable to 
an indeterministic situation. Consequently, in the following two chapters, I will 
develop definitions of causation that are compatible with indeterminism. 

Of course, a determinist need not define causation as a species of strict 
necessitation. Indeed, we will see that there are reasons for not doing so, even 
if determinism were true. 

The theses of token determinism and type determinism are independent. 
Much turns on the identity criteria for tokens. If we accept the principle that 
two tokens belonging to the same types and having the same causes are identi- 
cal, and we assume that every token belongs to its types essentially, then type 
determinism entails token determinism. However, token determinism does not 
entail type determinism, even if we assume that every token has all of its types 
essentially. A token s might cause token s’ with necessity, and s’ might belong 
to all of its types essentially, but it does not follow that the types of s necessitate 
the types of s’. 


4.2.2 Expressibility of Strictly Sufficient Conditions 


A total cause is something like a sufficient causal condition. However, there are 
two problems with simply defining a cause as a strictly sufficient condition. First 
of all, such a definition would make the non-existence of all sorts of recherché 
events part of the cause of actual events. For example, the cause of the starting 
of my car’s engine would include the non-existence of an immobilizing ray emit- 
ted from a passing UFO. I would like a definition of cause that would not require 
the inclusion of such purely negative and highly probable conditions. Second, we 
might be dealing with a model in which the only genuinely sufficient conditions 
are inexpressible as types. For example, all sufficient conditions might involve 
non-denumerably many atomic types, and our set of situation-types might be 
denumerable. 

Nonetheless, in this chapter I will set aside these worries, and take up the 
task of defining causation without strictly sufficient conditions in subsequent 
chapters. 


4.2.3 Empiricist Conceptions of Determinism 


Some philosophers, including van Fraassen (1987), Earman (1986), and Lewis 
(1983), have argued that empiricism implies that all modal properties (including 
which causal laws obtain) supervene on the distribution of occurrent properties 
in the world. In other words, if two worlds agree on the distribution of oc- 
current properties (like the primary qualities) throughout space and time, then 
they must also agree on all modal properties. One consequence of this version 
of empiricism is that we cannot have two worlds $w_1$ and $w_2$ which agree in
all of their occurrent properties, and so agree on all causal laws, if these are
interpreted as the simplest and most powerful extensional generalizations, but
which are such that in $w_1$ these laws actually constrain the course of events,
making alternative paths impossible, while in $w_2$ the course of events merely
happens to conform to the laws. The inability to make
such a distinction is a fatal flaw in the empiricist approach, as Tooley (1987) 
has argued. When we reach such an unacceptable result, it is time to go back 
and examine the credentials of the “empiricism” that led us there. 

As I will argue in part II, there is no reason to think of irreducibly modal 
properties as epistemologically problematic in a way that occurrent properties 
are not. Lewis and others have been led astray by a version of the Myth of the 
Given, according to which occurrent properties (like the primary and secondary 
qualities) are somehow directly present before the mind in a way that modal 
properties cannot be. In fact, there is no reason to think that our occurrent- 
property belief-forming apparatus is more basic or more reliable than our modal- 
property belief-forming apparatus. 

Indeed, from the perspective of causal or naturalized epistemology, occurrent 
properties could not be known were it not for irreducibly modal properties 
that accompany them, underlying the relation of reliability that constitutes 
the core of knowledge. Natural selection favors mental constitutions that are 
attuned to various modal and stochastic constraints. Natural selection rewards 
thinkers who successfully track what might/must/probably would happen under 
various situations, objectively speaking. Naturalized epistemology thus favors 
the modal realist. 

The empiricist approach to modality is based on a failed Humean epistemol- 
ogy. Empiricists also owe us an account of how our experiences are experiences 
of the instantiation of occurrent properties. I develop a causal-teleological ac- 
count of intentionality in part II. There is no competing empiricist account on 
the table. 

Empiricists such as Lewis (1983) define determinism by first defining a deterministic
set of laws. A set of laws is deterministic if it is impossible for two
worlds that conform to those laws to agree in all their properties at one point
in time and yet diverge thereafter. A world is deterministic if its laws are deterministic,
where the laws of a world are identified with the set of extensional
generalizations that combines the greatest simplicity with the greatest content. 
Thus, the empiricist account of determinism, unlike mine, makes no reference 
whatsoever to modal properties, such as necessity. 
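
The empiricist criterion lends itself to a simple finite check. The Python sketch below is a toy illustration, not a rendering of Lewis's own formalism. It assumes that a world is a finite tuple of states indexed by discrete times, that two worlds "agree in all their properties" at a time when their states at that index are equal, and that the set of worlds conforming to the laws is supplied by hand. The function names are hypothetical.

```python
def diverge_after_agreeing(w1, w2):
    """True if the two histories agree at some time and differ at a later time."""
    n = min(len(w1), len(w2))
    return any(
        w1[t] == w2[t] and any(w1[u] != w2[u] for u in range(t + 1, n))
        for t in range(n))

def deterministic(conforming_worlds):
    """A set of laws is deterministic iff no two conforming worlds
    agree at one time and yet diverge thereafter."""
    ws = list(conforming_worlds)
    return not any(diverge_after_agreeing(a, b)
                   for i, a in enumerate(ws) for b in ws[i + 1:])

# Hypothetical law "the state doubles at each step": two conforming histories.
DOUBLING = [(1, 2, 4), (3, 6, 12)]

# Hypothetical chancy law: the same start and middle, two different endings.
CHANCY = [(1, 2, 4), (1, 2, 5)]
```

On these inputs `deterministic(DOUBLING)` holds and `deterministic(CHANCY)` fails. The check mentions only the distribution of states across worlds, which matches the point that the empiricist definition makes no reference to necessity.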

If we try to translate Lewis’s empiricist account of determinism into the 
terms of modal realism, we might come up with something like this. According 
to Lewis, the laws of nature are contingent. However, in order to explain the 
fact that laws support counterfactual conditionals, Lewis supposes that the laws 
of the actual world hold throughout some neighborhood of the actual world in 
logical space. Corresponding to this neighborhood is a proposition, $N_a$, which
for Lewis is a set of possible worlds.

Let $\nu_a$ be a situation-type that corresponds to the proposition $N_a$: a token
s is of type $\nu_a$ just in case every possible world that extends s is in
the neighborhood $N_a$. Suppose the laws of nature include the generalization
$\forall x\,((Ax \,\&\, (x \models \phi)) \rightarrow \exists y\,(Ay \,\&\, x \prec y \,\&\, (y \models \psi)))$, that is, every actual
situation of type $\phi$ is followed by an actual situation of type $\psi$. Suppose
situation-token s is of type $\phi$ and of type $\nu_a$. If s is actual, then since it is of type $\nu_a$,
every possible world it is part of must support all the actual laws of nature.
These laws include the one connecting $\phi$ and $\psi$. Hence, the existence of such
an s strictly necessitates the subsequent existence of a token of type $\psi$.

Since all of the laws are deterministic, it follows that every temporally 
bounded situation is strictly necessitated by some earlier situation-token. Thus, 
Lewis’s account can be seen as a version of the strict-necessity account developed 
in this chapter. 


4.3 Basic Ontology 


In this section, I will introduce the model structures to be taken as formal 
representations of real possibilities concerning causation. These structures in- 
corporate two kinds of individuals: situation-tokens and situation-types. Ac- 
tual situation-tokens are to be thought of as real, concrete parts of the world, 
analogous to Davidsonian events. Merely possible situation-tokens are abstract 
objects, constructible from actual tokens and types, and representing possible 
but unrealized actualities. Each token carries a certain amount of information 
or fact about the world; these units of fact are represented as situation-types. 


4.3.1 Classification Systems 


A classification system consists of a set of tokens, a set of types, and a binary 
relation on the two sets (the classification relation).1 For my purposes, the set 
of tokens will be a set of situation-tokens, the set of types situation-types, and 
the classification relation the verification relation |= defined in the last chapter. 


4.3.2 Models 


Each model shall contain a classification system, together with two partial orderings
on situation-tokens, $\sqsubseteq$ and $\prec$. The first represents the part-whole relationship
of standard mereology. The second, a strict partial well-ordering,
represents the relation of causal precedence. In addition, there shall be two
modal accessibility relations, $R^o$ and $R^i$.

Consequently, a standard, deterministic model M consists of an n-tuple: 


$\langle Sit, Typ, R^o, R^i, \models, \sqsubseteq, \prec \rangle$, where:

• Sit is a nonempty set, the set of situation-tokens.

• Typ is a nonempty set of situation-types, closed under the various logical
and modal operations.

• $R^o$ and $R^i$ are binary relations on Sit, the outer and inner accessibility
relations introduced in the last chapter.

• $\models$ is a binary relation on Sit $\times$ Typ.

• $\sqsubseteq$ is a partial ordering of Sit (antisymmetric and transitive).

• $\prec$ is a strict partial ordering on Sit (irreflexive and transitive).
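
The components of a model can be written down as a small data structure, which makes the order-theoretic requirements checkable on finite examples. The Python sketch below is illustrative only. The class `Model`, its field names, and `toy_model` are hypothetical. It checks only the conditions stated in the list above (nonemptiness, antisymmetry and transitivity of part-of, irreflexivity and transitivity of priority), and it does not attempt to check closure of the types under logical and modal operations or the well-foundedness of priority.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Model:
    sit: frozenset       # situation-tokens (must be nonempty)
    typ: frozenset       # situation-types (closure under operations is NOT checked)
    outer: frozenset     # outer accessibility: pairs of tokens
    inner: frozenset     # inner accessibility: pairs of tokens
    verifies: frozenset  # classification relation: pairs (token, type)
    part: frozenset      # part-of: pairs (s, t) meaning s is part of t
    prior: frozenset     # causal priority: pairs (s, t) meaning s is prior to t

def transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def antisymmetric(rel):
    return all(a == b for (a, b) in rel if (b, a) in rel)

def irreflexive(rel):
    return all(a != b for (a, b) in rel)

def well_formed(m):
    """Check the finite, first-order conditions listed for a model."""
    on_sit = all(a in m.sit and b in m.sit
                 for rel in (m.outer, m.inner, m.part, m.prior)
                 for (a, b) in rel)
    typed = all(s in m.sit and t in m.typ for (s, t) in m.verifies)
    return (bool(m.sit) and bool(m.typ) and on_sit and typed
            and antisymmetric(m.part) and transitive(m.part)
            and irreflexive(m.prior) and transitive(m.prior))

def toy_model(prior):
    """A three-token example: s1 and s2 are parts of the whole w."""
    sit = frozenset({"s1", "s2", "w"})
    part = frozenset({(s, s) for s in sit} | {("s1", "w"), ("s2", "w")})
    return Model(sit=sit, typ=frozenset({"phi"}),
                 outer=frozenset(), inner=frozenset(),
                 verifies=frozenset({("s1", "phi"), ("w", "phi")}),
                 part=part, prior=frozenset(prior))
```

For instance, `well_formed(toy_model({("s1", "s2")}))` succeeds, whereas a priority relation containing `("s1", "s1")` is rejected because it is not irreflexive.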


4.3.3 Situation Types 


There are a set of primitive, atomic situation-types, corresponding to the simple, 
atomic formulas of the last chapter. There are complex types of the following 
kinds: 


• ($s \sqsubseteq s'$), where s and s’ are situation-tokens

• ($s \models \phi$), where s is a situation-token and $\phi$ is a situation-type

• As, where s is a situation-token

• $\Box\phi$ and $\Diamond\phi$, where $\phi$ is a situation-type

1 These structures have been independently discovered many times over. Birkhoff (1940)
called them “polarities,” and Hardegree (1982) called them “contexts.” They were also
invented by the German mathematician Wille, whose work is discussed in Davey and Priestley
(1990). More recently, Vaughan Pratt and his colleagues, working in the field of theoretical
computer science, have called them “classification systems.”

These categories of types were introduced in the situation logic of the last 
chapter. In this chapter, I add a new kind of atomic type: $s \prec s'$, expressing the
causal priority of s over s’. Corresponding to this type in the class of models
is a binary relation $\prec$ between situation-tokens. In this chapter, I will treat
causal priority as a primitive relation, and, for simplicity’s sake, I will assume 
that all situations have concordant and complete information about the priority 
relation, so: 


$M, s \models (s' \prec s'') \Leftrightarrow \langle s', s'' \rangle \in\ \prec$


4.3.4 Persistence of Situation-Types 


One situation excludes another whenever there exists no situation containing 
both of them as parts. If we assume that all actually possible situations are 
coherent, in the sense that there is no type $\phi$ such that the situation belongs
to both $\phi$ and $\neg\phi$, then facts about what possible situations exclude other
situations will be constrained by facts about the persistence of types, that is, 
about when a whole inherits the types of its parts. 

There are four forms of persistence that seem plausible: 


1. Global mereological persistence. If a part belongs to the type, so does the 
whole. 


2. Synchronic persistence. If s belongs to the type, $s \sqsubseteq s'$, and no part of
either is causally prior to any part of the other, then s’ also belongs to the
type.


3. Punctual persistence. If s belongs to the type, $s \sqsubseteq s'$, and exactly the
same situations are prior to both, then s’ also belongs to the type.


4. Non-persistence. There is no condition that guarantees that when a part 
belongs to the type, so does the whole. 
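
The first two forms of persistence can be tested on a finite model. The Python sketch below is offered as an illustration under simplifying assumptions: tokens and types are strings, the part-of, priority, and verification relations are explicit sets of pairs, and the tokens and types named in the example are hypothetical. It covers only forms 1 and 2.

```python
def parts_of(s, part):
    """All tokens x with x part of s, read off the part-of pairs."""
    return {x for (x, y) in part if y == s}

def globally_persistent(t, part, verifies):
    """Form 1: whenever a part is of type t, so is the whole."""
    return all((whole, t) in verifies
               for (s, whole) in part if (s, t) in verifies)

def synchronically_persistent(t, part, prior, verifies):
    """Form 2: as above, but only when no part of either token is
    causally prior to any part of the other."""
    def unordered(s, whole):
        return not any((x, y) in prior or (y, x) in prior
                       for x in parts_of(s, part)
                       for y in parts_of(whole, part))
    return all((whole, t) in verifies
               for (s, whole) in part
               if (s, t) in verifies and unordered(s, whole))

# Hypothetical tokens: a noon speech, an evening dinner, and the day containing both.
TOKENS = {"noon", "evening", "day"}
PART = {(s, s) for s in TOKENS} | {("noon", "day"), ("evening", "day")}
PRIOR = {("noon", "evening")}
VERIFIES = {("noon", "speaking"),
            ("noon", "speaking_at_noon"), ("day", "speaking_at_noon")}
```

In this toy model the dated type `speaking_at_noon` is globally persistent, since the whole day inherits it. The undated type `speaking` is not globally persistent, but it passes the synchronic test, because the only larger token containing the noon speech also contains a causally later part.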


A globally persistent type represents an eternal fact (such as a modal or 
mathematical fact), or a fact that includes reference to particular individuals 
(or places) at a particular time (such as ‘Clinton was speaking at noon, July 3, 
1994’). A synchronically persistent type represents a fact in which individuals 
or places involved are specified, but not the time, such as ‘Clinton speaking at 
the White House’. A punctually persistent type represents a purely qualitative 
type, in which no particular individual, place, or time is specified, such as ‘a 
man speaking on a platform’. In the case of a punctually persistent type, the 
spatiotemporal location of the fact is fixed by the causal antecedents of the 
situation-token belonging to the type. 

In appendix A, I assume that all situation-types are globally persistent. In 
this chapter and its successors, I will relax this assumption somewhat, but I 
will continue to assume that all types are at least punctually persistent. A 
causal theory of spacetime (to be discussed in the next section) depends on the 
model’s containing an adequate repertoire of types that are punctually but not 
synchronically or globally persistent. 


4.3.5 Identity Conditions for Tokens 


I will assume that each token has three kinds of properties essentially: its types 
(representing its intrinsic character or quality), its parts, and the network of its 
causal antecedents (representing its backward time-cone). The third assumption 
is a generalization of the Kripkean intuition that the origin of a thing is always 
essential to it. It seems plausible to suppose that a particular event could 
not have been the very event it is if either the intrinsic character of the event 
were different, or if the causal chain leading up to the event were different. In 
contrast, the subsequent course of events, causally posterior to an event, is not 
essential to its identity. The very same event could exist in different worlds, 
with different subsequent histories. 

Some such view as this seems to be implicit in our conviction that the past 
is fixed and the future is open. It is surely not the case that the type of event 
realized by the present state of the universe necessitates a particular type of 
prior history. It is metaphysically possible for an event just like the one realized 
by the present state of the universe simply to pop into existence without any 
real past. However, we remain convinced that the past is somehow necessary, 
given the present. The best way to make sense of this conviction is to say that 
the existence of the event-token of the present state of the universe necessitates 
the existence of the particular tokens in its actual history. Since past tokens are 
causally prior to present tokens, we can generalize this to the thesis that the 
token causal antecedents of any token are necessitated by it. 

If we make these assumptions, then any possible token could be represented 
as an ordered triple, consisting of a set of coherent types, a set of possible 
tokens (representing the token’s proper parts), and a causal tree of possible 
tokens, rooted in the token itself (if we use non-well-founded set theory) or in the 
immediate causal antecedents of the token. I would not want to identify a real
situation-token with such an ordered triple. There is, however, a homomorphism
from actual tokens to the set of such triples. Triples that do not represent actual
situation-tokens can be taken as representing merely possible tokens.
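
The triple representation can be illustrated with a small sketch. It is only an illustration under stated assumptions: the class name `Token`, its field names, and the example tokens are hypothetical, types are modelled as plain strings, and the causal tree is rooted in the immediate antecedents (the well-founded option mentioned above). Structural equality of the triples stands in for the claim that sameness of types, parts, and antecedents suffices for token identity.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Token:
    """A possible token as a triple (types, proper parts, causal antecedents)."""
    types: FrozenSet[str]
    parts: FrozenSet["Token"] = frozenset()
    antecedents: FrozenSet["Token"] = frozenset()


# Hypothetical tokens, for illustration only.
origin = Token(types=frozenset({"initial-state"}))
spark = Token(types=frozenset({"spark"}), antecedents=frozenset({origin}))
fire_1 = Token(types=frozenset({"fire"}), antecedents=frozenset({spark}))
fire_2 = Token(types=frozenset({"fire"}), antecedents=frozenset({spark}))
fire_3 = Token(types=frozenset({"fire"}), antecedents=frozenset({origin}))

# Same types, parts, and antecedents: one and the same possible token.
same = fire_1 == fire_2
# Same types but a different causal history: distinct possible tokens.
different = fire_1 != fire_3
```

On this representation, two merely possible tokens with the same types but different causal antecedents come out distinct by construction, rather than as a matter of brute fact.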

If we do not adopt the thesis of the relative necessity of causal antecedents, it 
is hard to see how we could provide clear criteria of identity for merely possible 
tokens. Suppose a and b are two non-actual tokens that share all the same types, 
but have quite different causal antecedents. What could possibly determine 
whether or not a and b are identical? They do not actually exist, so it seems
implausible to say that their identity or distinctness could be a matter of brute 
fact. If we are to embrace some sort of actualism, we seem to be forced to say 
that merely possible tokens are constructed from actual tokens and types in the 
way sketched above. 

The substance of my account of the identity of tokens can be broken down 
into two claims: (1) the claim that identity of types, parts, and antecedents 
is necessary for identity (even trans-world identity), and (2) the claim that
these identities are sufficient for token identity. On the question of necessity,
the claim that the sameness of both parts and antecedents is essential to a
situation-token is an extension of the idea that the constitution and origin of a 
concrete thing are both essential to it. It is true that in natural language we 
sometimes treat event-tokens with slightly different parts and antecedents as 
identical. For example, we might say that the death of Caesar would have been 
less painful had Brutus not participated. However, such looseness in natural 
language should not be taken as settling the metaphysical issue. 

On the question of the sufficiency of the criteria, it is important to compare 
the identity criteria with the means we actually use in settling event identity 
questions. Typically, we identify two events, such as the beginning of the Civil 
War and the attack on Fort Sumter, by finding that one and the same event 
is responsible for two distinct effects. This typically involves tracing the causal 
antecedents of the effects back until in each chain we find an event-token of 
the same type at the same location in space and time. I will argue in section 
5.10.2 that spatiotemporal location is determined by the parts and the causal 
antecedents of an event-token. Thus, our practice of identifying event-tokens 
seems to take the sameness of types, parts, and antecedents as sufficient for 
token identity. 


4.3.6 The Causal Priority Relation 


The causal priority relation $\prec$ cannot be identified simply with causation.
Instead, it represents a necessary precondition for causation. In fact, under the
assumption of determinism, these three causal notions are interdefinable:


• $\prec$, the relation of causal priority.

• $\rhd$, the relation of being a total cause of.

• $\leadsto$, the relation of being an essential part of a total cause of, Mackie’s
INUS condition: an insufficient but necessary part of an unnecessary but
sufficient condition.


I will assume that the causal priority relation is transitive and irreflexive. A 
token s is immediately prior to s’ just in case s is prior to s’, and there are no 
intermediate tokens between any part of s and any part of s’. 


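$$(s \prec_0 s') \leftrightarrow (s \prec s') \,\&\, \neg\exists x\,\exists y\,\exists z\,(x \sqsubseteq s \,\&\, y \sqsubseteq s' \,\&\, x \prec z \,\&\, z \prec y)$$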
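
The definition of immediate priority can be checked mechanically on a finite model. The following is a minimal sketch, assuming a hypothetical encoding in which tokens are strings, `prior` is a set of ordered pairs standing for causal priority, and `parts` maps each token to the set of its parts (itself included). None of these names comes from the text.

```python
def immediately_prior(s, t, prior, parts, tokens):
    """True if s is prior to t and no token z lies causally between
    any part x of s and any part y of t (x prior to z, z prior to y)."""
    if (s, t) not in prior:
        return False
    return not any(
        (x, z) in prior and (z, y) in prior
        for x in parts[s]
        for y in parts[t]
        for z in tokens
    )


# Hypothetical three-token chain a, b, c with a transitive priority relation.
tokens = {"a", "b", "c"}
prior = {("a", "b"), ("b", "c"), ("a", "c")}
parts = {t: {t} for t in tokens}  # each token is its own only part here

a_b = immediately_prior("a", "b", prior, parts, tokens)  # nothing in between
a_c = immediately_prior("a", "c", prior, parts, tokens)  # b intervenes
```

In this toy chain a is immediately prior to b, while a is prior but not immediately prior to c, since b lies between them.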


The three causal relations $\prec$, $\leadsto$, and $\rhd$ are such that any two of these can be
defined in terms of the third, together with the mereological part-whole relation
$\sqsubseteq$:


• $s \prec s'$ iff there is a world (a coherent and total situation) w such that s is
a part of the mereological sum of INUS causes of s’ in w (or equivalently,
iff s is a part of the mereological sum of minimal total causes of s’ in w).

• $w \models s \leadsto s'$ iff s is part of some minimal total cause of s’ in w.

• $w \models s \rhd s'$ iff $s \prec s'$, $s, s' \sqsubseteq w$, and s necessitates s’.


I will take the causal priority relation to be primitive and define the other 
two in terms of it, since in the following chapter I will be able to offer a definition 
of the causal priority relation in terms of mereological and modal relations. In 
this case, the first condition above imposes a minimality requirement on the
extension of $\prec$ in a model.

It is also possible to define the mereological part-whole relation using only a 
primitive sum operation on situations and the causal relation $\rhd$. I will discuss
this fact further in the section on spacetime topology. 

These causal relations interact with mereology in a number of interesting 
ways. First of all, the total cause relation is closed under the operation of 
taking a part of the right term. If s is a total cause of s’, and $s'' \sqsubseteq s'$, then s is
a total cause of s”. This does not hold for $\leadsto$ or $\prec$.

For all three relations, we can take the sum of terms on the right: if $s \rhd s'$
and $s \rhd s''$, then $s \rhd (s' \sqcup s'')$, and similarly for $\leadsto$ and $\prec$.

Both $\leadsto$ and $\prec$ are closed under the operation of taking a part of the term
on the left. If $s \leadsto s'$, and $s'' \sqsubseteq s$, then $s'' \leadsto s'$, and similarly for $\prec$. This does
not hold for the total cause relation.

The mereological sum of two total causes of s is always a total cause of s,
and likewise for causal priority. However, the sum of two INUS causes need not
be an INUS cause, since one might make the other redundant. Finally, if $s_1 \rhd s'$,
$s_2 \rhd s''$, $s_1 \prec s''$, and $s_2 \prec s'$, then $(s_1 \sqcup s_2) \rhd (s' \sqcup s'')$.

The following axioms characterize the interaction between mereology and 
causal priority: 


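Axiom 4.1 $(x \prec y \,\&\, z \sqsubseteq x) \rightarrow z \prec y$


Axiom 4.2 $(x \prec (y \sqcup z)) \rightarrow (x \prec y \vee x \prec z)$


Axiom 4.3 $(x \prec y \,\&\, z \sqsubseteq y) \rightarrow \neg(z \prec x)$


Axiom 4.4 $x \prec y \rightarrow \neg(x \mathrel{O} y)$


Axiom 4.5 $(x \prec y \,\&\, y \prec z) \rightarrow x \prec z$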


We can think of $x \prec y$ as meaning that x lies entirely in the backward time
cone (causally speaking) of y. If we model mereology by means of sets of atoms,
defining $a \sqsubseteq b$ as true if and only if every atom in |a| (the interpretation of a as
a set of atoms) is also in |b| (the interpretation of b), then we can define the
causal priority relation between tokens in terms of an underlying causal priority
relation $\prec_{atom}$ between atoms, specifically:


$$
\begin{aligned}
M \models (a \prec b) \leftrightarrow{} & \forall z\,(z \in |a| \rightarrow \exists w\,(w \in |b| \,\&\, z \prec_{atom} w)) \,\&\, \\
& \forall z\,(z \in |b| \rightarrow \neg\exists w\,(w \in |a| \,\&\, z \prec_{atom} w)) \,\&\, \\
& \forall z\,(z \in |a| \rightarrow z \notin |b|)
\end{aligned}
$$

In other words, a token a is prior to a token b just in case every atom in a
is prior to some atom in b, no atom in b is prior to any atom in a, and a and b
have no atoms in common.
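
The atom-based definition lends itself to a brute-force check on a small model. The sketch below is illustrative only: the four atoms, the relation `ATOM_PRIOR`, and the function names are assumptions of the example, tokens are non-empty sets of atoms, and the atom-level relation is chosen to be transitive and irreflexive. It spot-checks Axioms 4.1, 4.3, 4.4, and 4.5 on this one model, which is a sanity check and not a proof.

```python
from itertools import combinations


def prior(a, b, atom_prior):
    """Token-level priority under the atom model (a, b: non-empty frozensets)."""
    return (
        # every atom in a is prior to some atom in b
        all(any((z, w) in atom_prior for w in b) for z in a)
        # no atom in b is prior to any atom in a
        and not any((z, w) in atom_prior for z in b for w in a)
        # a and b have no atoms in common
        and a.isdisjoint(b)
    )


# Hypothetical model: atoms 1 and 2 precede 3, and 3 precedes 4 (transitive).
ATOMS = [1, 2, 3, 4]
ATOM_PRIOR = {(1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}
TOKENS = [frozenset(c) for n in range(1, 5) for c in combinations(ATOMS, n)]


def check_axioms():
    """Spot-check Axioms 4.1, 4.3, 4.4, 4.5 over all non-empty tokens."""
    for x in TOKENS:
        for y in TOKENS:
            if not prior(x, y, ATOM_PRIOR):
                continue
            assert x.isdisjoint(y)                      # Axiom 4.4
            for z in TOKENS:
                if z.issubset(x):
                    assert prior(z, y, ATOM_PRIOR)      # Axiom 4.1
                if z.issubset(y):
                    assert not prior(z, x, ATOM_PRIOR)  # Axiom 4.3
                if prior(y, z, ATOM_PRIOR):
                    assert prior(x, z, ATOM_PRIOR)      # Axiom 4.5
    return True
```

Calling `check_axioms()` runs the four checks over all fifteen non-empty tokens of the model and returns `True` if none fails.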

Although the causal priority relation is an undefined primitive, its interpretation
is highly constrained, so as to give $a \prec b$ the meaning: a is prior and
relevant to b. I stipulate that a model is proper only if, whenever $a \prec b$ is true,
a is part of some minimal total cause of b in some world in the model.

Causal priority is stipulated to be transitive and irreflexive. Consequently, 
both total causation and INUS causation are irreflexive. It is easy to verify that 
total causation is transitive. INUS causality is transitive only under special 
conditions (see chapter 3). 


4.4 Constraints and Causation 


4.4.1 Token-Level Causation 


To say that s caused s’ is to say that the actuality of s was something like a 
causally sufficient condition for the actuality of s’. I do not in fact think that 
the actuality of a cause strictly necessitates the actuality of its effect. In fact, I 
think that the reverse is probably true: the identity of the causes of a situation 
is essential to its identity. However, in this chapter I will pretend that causes 
do necessitate their effects, in order to capture a deterministic conception of 
causality. 


Definition 4.1 (Token-to-Token Constraint) 


$$(s_1 \vdash s_2) =_{def} \Box(As_1 \rightarrow As_2)$$


This definition can also be generalized to a relation between tokens and sets
of tokens (assuming that our language has been enriched by some means of
referring to sets of tokens). Token s constrains the actuality of set B just in
case it necessitates the actuality of some member of B:


$$(s_1 \vdash B) =_{def} \Box(As_1 \rightarrow \exists x \in B\, Ax)$$


Definition 4.2 (Token Causation under Determinism)


$$(s_1 \rhd s_2) =_{def} As_1 \,\&\, (s_1 \prec_0 s_2) \,\&\, (s_1 \models (s_1 \vdash s_2))$$


This definition of token causation can also be generalized to a relation between
tokens and sets of tokens:


$$(s_1 \rhd B) =_{def} As_1 \,\&\, \forall x \in B\,(s_1 \prec_0 x) \,\&\, (s_1 \models (s_1 \vdash B))$$


The token-to-token constraint relation is one of strict necessitation: every
world containing the first situation must also contain the second. Token determinism
consists of two theses: causes must be actual, and causes constrain their
effects to be actual as well. Given the definition of constraint, it follows that if
s is a world, and $s \models (s_1 \rhd s_2)$, then both $s_1$ and $s_2$ must be actual in (parts
of) s.
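
The reading of constraint as strict necessitation can be made concrete with a toy possible-worlds model. This is a sketch under simplifying assumptions that are not in the text: each world is represented simply as the set of tokens actual in it, tokens are strings, and the names `constrains` and `constrains_set` are hypothetical.

```python
def constrains(s1, s2, worlds):
    """Token-to-token constraint: every world containing s1 contains s2."""
    return all(s2 in w for w in worlds if s1 in w)


def constrains_set(s1, B, worlds):
    """Token-to-set constraint: every world containing s1 contains
    at least one member of the set B."""
    return all(any(x in w for x in B) for w in worlds if s1 in w)


# Hypothetical worlds, each given by the tokens actual in it.
worlds = [
    frozenset({"strike", "flame"}),
    frozenset({"strike", "flame", "smoke"}),
    frozenset({"rain"}),
]

strike_flame = constrains("strike", "flame", worlds)  # holds in every strike-world
flame_smoke = constrains("flame", "smoke", worlds)    # fails in the first world
flame_either = constrains_set("flame", {"smoke", "strike"}, worlds)
```

Here "strike" constrains "flame", "flame" does not constrain "smoke", and "flame" constrains the set containing "smoke" and "strike" because each flame-world contains at least one of them.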


4.4.2 Type/Type Constraints 


We can also define causal constraints between situation-types. In order to do 
so, I must first define a causal succession relation between tokens, abbreviated 
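as $s \mathrel{N} s'$.


Definition 4.3 (Causal Succession)


$$\forall x\,\forall y\,(x \mathrel{N} y =_{def} \forall z\,(z \sqsubseteq y \leftrightarrow (Az \,\&\, (x \prec_0 z))))$$


Definition 4.4 (Causal Constraint on Types)


$$(\phi \mathrel{|\!\sim} \psi) =_{def} \Box\,\forall x\,((Ax \,\&\, (x \models \phi)) \rightarrow \exists y\,(x \mathrel{N} y \,\&\, (y \models \psi)))$$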


A causally informed constraint from $\phi$ to $\psi$ entails that every $\phi$-situation
must be immediately followed by a $\psi$-situation.

Type constraints give rise to a distinctive form of modal logic. Since we 
are working with partial, three- or four-valued worlds, substitution into modal 
contexts is permissible only if the relevant types are strong-Kleene or Dunn 
equivalent, not just classically equivalent. (See appendix A for the details.) For 
example, $\phi$ and $((\phi \,\&\, \psi) \vee (\phi \,\&\, \neg\psi))$ are classically, but not strong-Kleene or
Dunn, equivalent. This hyperintensionality of causal contexts is vital to their 
use in explicating teleological and representational properties. 
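
The failure of strong-Kleene equivalence in this example can be verified with a three-valued truth table. The sketch below covers only the three-valued (strong Kleene) case and does not attempt the four-valued Dunn semantics; the numeric encoding of the truth values and the function names are assumptions of the illustration.

```python
# Strong Kleene values ordered F, U, T, encoded as 0, 1, 2.
F, U, T = 0, 1, 2


def neg(a):
    """Negation swaps T and F and leaves U fixed."""
    return 2 - a


def conj(a, b):
    """Conjunction is the minimum of the two values."""
    return min(a, b)


def disj(a, b):
    """Disjunction is the maximum of the two values."""
    return max(a, b)


def expanded(phi, psi):
    """Value of (phi & psi) v (phi & not-psi)."""
    return disj(conj(phi, psi), conj(phi, neg(psi)))


# Assignments on which the expanded formula differs in value from phi itself.
classical_diffs = [(p, q) for p in (F, T) for q in (F, T) if expanded(p, q) != p]
kleene_diffs = [(p, q) for p in (F, U, T) for q in (F, U, T) if expanded(p, q) != p]
```

On classical assignments the two formulas never come apart. With the undefined value available, they differ exactly when the first type is true and the second is undefined: the expanded formula is then undefined although the first type is true.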


4.5 Defining Causal Explanation 


A causal explanation is a relation between one token-type pair and another 
token-type pair. A pair $(s, \phi)$ causally explains a pair $(s', \psi)$ just in case s
caused s’, and s’s being of type $\phi$ explains why its effect had to be of type $\psi$.
This corresponds closely to Terence Horgan’s (see Horgan (1989) and LePore
and Loewer (1989)) notion of quausation: s qua $\phi$ causes s’ qua $\psi$. I will use
this notion to define the causal powers of a property, and it can also be used to 
define the causal relevance of a particular instantiation of a property to other 
facts. 

My definition of causal explanation is intended to capture only the metaphysical
core of our ordinary notion of ‘explanation’. As many have observed,
there are a host of pragmatic factors that enter into making something a good
or apt explanation. Explanation in this full, pragmatic sense is typically contrastive:
we explain why something is $\phi$ as opposed to $\psi$. Moreover, explanation
depends on the knowledge and interests of the audience: we do not typically cite 
something that everyone knows was present, such as including the presence of 
oxygen in the atmosphere as part of the explanation of a house fire. However, 
I am aiming at a characterization of an interest-independent, non-pragmatic 
explanatory relation, one that constitutes a necessary condition of something’s 
being a correct explanation. This relation could also be thought of as a rela- 
tion of objective causation between facts, where facts are identified with pairs 
consisting of an actual situation-token and a type that it supports. 


Definition 4.5 (Causal Explanation (Fact /Fact Causation)) 


$$((s_1 : \phi) \rhd (s_2 : \psi)) =_{def} As_1 \,\&\, s_1 \mathrel{N} s_2 \,\&\, (\phi \mathrel{|\!\sim} \psi) \,\&\, (s_1 \models \phi)$$


Causal explanation is veridical in both terms: both $s_1$ and $s_2$ must be parts
of s, $s_1$ must be of type $\phi$, and $s_2$ must be of type $\psi$. It is also irreflexive:
no token-type pair explains itself. The transitive closure of the explanation
relation would also be irreflexive, so explanatory loops are excluded. If the
transitive closure of the $\prec$ relation is a partial well-ordering, there cannot be
any explanatory infinite regresses. 

As I just said, causal explanation under the deterministic conception is provably
sound: if there exists an explanation of s’s being $\psi$, then s really is $\psi$. In
contrast, there is no necessity that explanation be complete — that there be an 
explanation for every type characterizing every causally consequent situation. 
Thus, the deterministic conception of causation does not by itself guarantee 
that type determinism is true. We can consider the completeness of causal 
explanation as an optional hypothesis. 


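Theorem 4.1 (Soundness of Causal Explanation)

$$((s_1 : \phi) \rhd (s_2 : \psi)) \rightarrow As_2 \,\&\, (s_2 \models \psi)$$

Proof. A trivial consequence of the definitions.


Hypothesis 4.1 (Completeness of Causal Explanation)

$$(s_1 \mathrel{N} s_2 \,\&\, (s_2 \models \psi)) \rightarrow \exists\phi\,((s_1 : \phi) \rhd (s_2 : \psi))$$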


4.5.1 Negative Causation 


Negative facts (that is, actual tokens paired with negative types that they sup- 
port) can serve both as causes and effects. Preventions involve negative effects; 
causation by omission or absence involves negative causes. There are also cases
(collected by Jonathan Schaffer (2001)) of positive cause/effect pairs in which
the causal connection between the two passes through a wholly negative inter- 
mediary. We can have that A causes B by preventing a state that, otherwise, 
would have prevented B. For example, a terrorist causes a midair collision of 
two airliners by kidnapping the air traffic controller. The kidnapping causes 
an absence — the absence of the controller from his post — and this absence 
in turn causes the collision, by failing to produce the radio signals needed to 
prevent the collision. 

Negative facts exist in abundance. For every positive situation-type, there 
exists a corresponding negative type, its negation. Situation-tokens are, typically,
partial in nature. Hence, in many cases, a token s supports neither type $\phi$
nor its negation $\neg\phi$. For this reason, we cannot simply identify supporting $\neg\phi$
with not supporting $\phi$. In many cases, perhaps in all, when a token s supports
a negative type $\neg\phi$, there is some positive type $\psi$ that competes with or excludes
$\phi$ ($\psi$ might be a different determination of some common determinable)
and is supported by s.2 However, we must clearly distinguish between the two
questions: 


• Are there genuine negative types and negative facts (consisting of the
pairing of an actual token with a negative type it supports)?


• Are there purely negative tokens, that is, tokens that support only negative
types?


If our answer to the first question were “No,” we would have to find a 
“No” answer to the second question nearly compelling, unless we were willing 
to countenance the existence of the empty token, a token supporting no types 
whatsoever (surely an implausible supposition). However, I am endorsing a 
“Yes” answer to the first question, and, consequently, I consider the second to 
be very much an open question, to be decided on scientific, and not a priori, 
grounds. 

D. H. Mellor has argued that the reality of negative causation argues strongly 
against taking concrete events as the relata of causation, since the mere absence 
of something does not constitute a concrete event. Does Mellor’s argument 
apply to my own account of token causation, of causation as a relation between 
situation-tokens? Situations are clearly a broader category than that of events: 
every event is a situation, but not vice versa. If the absences that figure in 
negative causation were mere nothings, if the corresponding relation-instances 
of causation involved the relation of causation with one or the other of its 
relata simply missing, then negative causation would refute my thesis that every 
instance of causation involves two situation-tokens. 

However, the absences involved in these cases are not mere nothings: they
are absences of a particular sort, at a particular time and place. For example,
in a Democritean universe, situations involving the void are every bit as much
a part of the world as are the situations of the atoms. It is the absence of
the controller from his post at the time immediately preceding the collision that
caused the disaster. I see no reason to doubt that this absence is supported by
part of the world, by a particular, concrete situation-token. Is there a token
that supports only the absence of the controller at that place and time? This is
a more difficult question to answer, but however we answer it, the compatibility
of tokens as relata and negative causation is secure.


2 It would be a serious error to try to reduce negation to something like exclusion or
competition between types, since these notions themselves seem to involve an element of negation:
type $\phi$ excludes $\psi$ just in case it is not possible for the two to be instantiated together. It is
best, I think, to treat negation as a primitive, indefinable relation between types.


4.6 Singular Causation 


Following Tooley, I will use “Humean supervenience” to represent the thesis 
that the facts about token-causation supervene upon occurrent facts plus the 
facts about the actual causal laws. To deny Humean supervenience is to af- 
firm the possibility of singular causation, causal connections whose existence is 
inexplicable in terms of causal laws and non-causal facts. 

My account so far is neutral on the question of Humean supervenience. How- 
ever, it does treat singular causal connection as a more basic notion than that 
of causal law. This does not preclude Humean supervenience, but it certainly 
makes this thesis an unnatural assumption to make without corroborating evi- 
dence. 

In fact, the notion of causal law does not play a central role in my account, in 
contrast to the Armstrong/Tooley tradition. I prefer to make use of modal and 
stochastic notions, rather than talking directly about “lawlike” generalizations. 

If the hypothesis of the completeness of causal explanation is true, then every 
instance of token-level causation falls under some necessary generalization at the 
level of types. This implication of explanatory completeness is important enough 
to warrant separate attention. I shall refer to it as Hume’s hypothesis, since its 
truth is entailed by the extensional adequacy of Hume’s definition of causation 
in terms of relations between types. 


Hypothesis 4.2 (Hume’s Hypothesis) If $(s \rhd s')$ and $(s' \models \psi)$, then there
exists a type $\phi$ such that $(s \models (\phi \mathrel{|\!\sim} \psi))$ and $(s \models \phi)$.


Hume’s hypothesis can be generalized by use of the generalized (token-set) 
causation. 


Hypothesis 4.3 (Generalized Hume’s Hypothesis) If $(s \rhd B)$ and $\forall s' \in B\,(s' \models \psi)$,
then there exists a type $\phi$ such that $(s \models (\phi \mathrel{|\!\sim} \psi))$ and $(s \models \phi)$.


These hypotheses do not entail Humean supervenience, however, since even 
if they held, it could still be the case that which token is causally connected 
to which is not determined by the combination of non-causal facts about the 
tokens plus the type-level necessities. It may be that law-like generalizations 
always presuppose some irreducible facts about token-level causal connections. 
This is especially plausible if, as I will argue in section 4.10.2, space and time
are themselves constructible from such token-level causal connections. Typically,
causal generalizations will make reference to the spatiotemporal relations
between the cause and effect. 

Both Armstrong and Tooley are overly concerned about whether causal laws 
are contingent or necessary (they both insist that these laws are contingent). 
Are necessities of causal connection themselves necessary or contingent? This is 
a familiar question in modal logic. It amounts to asking whether metaphysical 
necessity is at least S4; that is, is the relevant accessibility relation transitive? 
Armstrong and Tooley are, in effect, asserting that necessity is not S4, that some
necessities are themselves non-necessary. I am inclined to believe that most 
causal necessities at least are contingent, but, unlike Armstrong and Tooley, I 
do not see any interesting metaphysical issues turning on this question. 
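
The link between the S4 principle (whatever is necessary is necessarily necessary) and transitivity of the accessibility relation can be illustrated on a small Kripke frame. The sketch is a finite check with a single propositional letter, not a general proof; the frames, world labels, and function names are assumptions of the example.

```python
from itertools import product


def box(R, worlds, truth):
    """Worlds at which 'necessarily X' holds, where X is true exactly at the
    worlds in `truth`: every accessible world must lie in `truth`."""
    return {w for w in worlds if all(v in truth for (u, v) in R if u == w)}


def axiom4_valid(R, worlds):
    """Check 'box p implies box box p' at every world, for every valuation of p."""
    for bits in product([False, True], repeat=len(worlds)):
        p = {w for w, b in zip(worlds, bits) if b}
        box_p = box(R, worlds, p)
        if not box_p.issubset(box(R, worlds, box_p)):
            return False
    return True


worlds = [0, 1, 2]
# Reflexive but not transitive: 0 sees 1, and 1 sees 2, but 0 does not see 2.
R_nontransitive = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2)}
# The same frame with the missing arrow added.
R_transitive = R_nontransitive | {(0, 2)}

fails_without_transitivity = not axiom4_valid(R_nontransitive, worlds)
holds_with_transitivity = axiom4_valid(R_transitive, worlds)
```

On the non-transitive frame there is a valuation (p true at worlds 0 and 1 only) under which p is necessary at world 0 without being necessarily necessary there, which is the kind of contingent necessity at issue. On the transitive frame no such valuation exists.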

Armstrong and Tooley seem to have a tendency to confuse the necessary/ 
contingent contrast with the analytic/synthetic distinction. They seem to sup- 
pose that, if some causal laws were necessary, they would have to be analytic 
as well. Since no causal law is analytic, they infer that all causal laws are con- 
tingent. However, I cannot see how we can exclude the possibility that at least 
some causal laws are necessary but synthetic. 


4.6.1 Heterogeneous Causal Explanations 


Heterogeneous explanations would include transitions from folk science to ad- 
vanced science, and vice versa. For example, one might explain the occurrence 
of nuclear fission in terms of slowly pulling a control rod from a reactor core 
(an advanced explanandum and a folk explanans). Or, one could explain the 
fragility of glass in terms of its molecular structure (folk explanandum, advanced 
explanans). 

Some of the most interesting cases of heterogeneous explanations are psychophysical
and physicopsychic explanations. The conceptual incongruity of
the two systems of classification, including the fuzziness of one and the precision
of the other, is no bar to the existence of genuine explanations. Where
explanation breaks down is in the borderline, indeterminate cases. If John is 
a borderline case of baldness, or the clump of salt is a borderline heap, then a 
physical explanation of either fact will be problematic. 

This conclusion contrasts sharply with Jaegwon Kim’s well-known views on 
the matter. Kim embraces a principle of causal inheritance (Kim, 1993, p. 351): 


If M is instantiated on a given occasion by P, then the causal powers 
of this instance of M are identical with (perhaps a subset of) the 
causal powers of P. 


I take the ‘causal powers’ of a token to be a function of the causal explanatory 
force of the types of that token. A token that realizes a mental type has, by 
virtue of realizing that type, certain irreducible causal powers. These powers 
include powers to influence both the mental and the physical properties of other 
tokens. These mental-level powers can be supervenient on the physical-level
powers of the token without being identical to some subset of the latter, contra
Kim.


4.7 Empiricism and Modality 


Van Fraassen has argued that the sort of naive reliance on modality that char- 
acterizes my approach violates certain empiricist strictures. In particular, van 
Fraassen argues that a modal realist like me, who denies that modal facts super- 
vene on the non-modal facts, cannot solve the “inference problem” (van Fraassen 
(1987)). This inference problem concerns the rationality of accepting axiom T 
of modal logic: if necessarily $\phi$, then $\phi$. Since I decline any attempt to define
necessity, I cannot argue that T is an analytic truth, derivable by deductive
logic from a set of stipulative definitions. How then can I claim that acceptance
of T is rationally obligatory? If I deny that it is rationally obligatory, I have
no basis for claiming that causal explanation is sound, or that causal necessities 
constrain the actual sequence of events in the world. 

My response to van Fraassen is simply to insist that the acceptance of T is 
required by the proper functioning of the human mind, which I do not take to be 
exhausted by conformity to the demands of deductive logic. Axiom T is in fact 
always true, and necessarily so. Hence, reliance on T is highly reliable, as reliable 
as reliance on any axiom of standard first-order logic. The “inference problem” 
is a problem only for one who, like van Fraassen, is wedded to the Humean 
doctrine that the only standard of rational belief is closure under standard 
deductive logic.3


4.8 Causal Relevance 


A key notion in my definitions of teleofunction and of modal knowledge is that 
of causal relevance. There are two ways to define the causal relevance of the 
type of a token to a type of a second token. The first way makes use of the 
INUS connective, $\leadsto$.


Definition 4.6 (Causal Relevance, I) $(s : \phi) \leadsto (s' : \psi)$ if and only if (i)
$s \leadsto s'$, (ii) $(s \models \phi)$ and $(s' \models \psi)$, (iii) $\phi$ is a natural type (relative to s), and
(iv) for all s”, if $s \leadsto s''$ and $s'' \sqsubseteq s'$, then $s' = s''$.


In other words, $(s : \phi)$ is causally relevant to $(s' : \psi)$ just in case: $s \models \phi$,
$s' \models \psi$, $\phi$ is a natural (not gerrymandered) type, and s’ is a minimal token
verifying the relation $s \leadsto s'$. Thus, mereological minimality comes into the
definition of causal relevance twice: first in the definition of the INUS condition 
(s is an INUS cause of s’ just in case s is part of a minimal total cause of s’), 
and second, in the definition of causal relevance itself. 


3 For a further discussion of this topic, as part of a general account of inductive knowledge,
see section 19.7.


A second approach to the definition of causal relevance would be to define
a relation of subtype. Type $\phi$ is a subtype of type $\psi$ just in case every possible
token that verifies $\psi$ also verifies $\phi$. The intension of the subtype is a subset
of the type’s own intension. Two types are identical if each is a subtype of the
other, i.e., if their intensions coincide.

Using subtypes, we can define a minimal explanation: 


Definition 4.7 (Minimal Explanation) $((s_1 : \phi) \rhd_{min} (s_2 : \psi))$ if and only
if $\phi$ is natural (relative to $s_1$), and for every natural type $\chi$ such that $(s_1 : \chi)$
and $\chi$ is a subtype of $\phi$, $((s_1 : \chi) \rhd (s_2 : \psi))$ iff $\chi = \phi$.


Finally, causal relevance can be defined in terms of minimal explanation, 
exactly as INUS causation has been defined in terms of minimal token causation. 


Definition 4.8 (Causal Relevance, II) $(s : \phi) \leadsto (s' : \psi)$ if and only if $\phi$ is
natural (relative to s), and there exists a type $\chi$ such that $\chi$ is a subtype of $\phi$
and $((s : \chi) \rhd_{min} (s' : \psi))$.


It would be worthwhile to investigate under what conditions these two defi- 
nitions of causal relevance coincide. 


4.8.1 Merely Disjunctive (Gerrymandered) Properties 


It is a commonplace of the philosophy of causation that merely disjunctive prop- 
erties, that is, properties formed by the disjunction of unrelated and heteroge- 
neous properties, cannot be causally efficacious. For instance, one can explain 
a fever by attributing the property of having the mumps, but not by attributing 
the property of having the mumps or suffering from sunstroke. The latter is not 
a natural property. Some disjunctions are not merely disjunctive: for example, 
the property of being a marsupial or being a placental mammal corresponds to 
the natural property of being a mammal. 

The real difficulty lies in finding a principled way of distinguishing disjunctive 
predicates that represent merely disjunctive properties from those that represent 
natural ones. We don’t want to rely simply on linguistic form: a type $\phi \vee \psi$
might correspond to a perfectly natural, non-gerrymandered type, of which $\phi$
and $\psi$ are exhaustive sub-types. The most promising strategy is to make use
of the causal laws (or type/type constraints) in which the property figures. A
property $\phi \vee \psi$ is merely disjunctive just in case, for every property $\chi$, the
constraint $(\phi \vee \psi) \mathrel{|\!\sim} \chi$ holds if and only if both $\phi \mathrel{|\!\sim} \chi$ and $\psi \mathrel{|\!\sim} \chi$ hold. However,
this method of distinguishing merely disjunctive from natural properties fails 
in the present context, in which the constraints are held to be deterministic. 
Every disjunction would turn out, according to the deterministic model, to be 
merely disjunctive. 

An alternative characterization of merely disjunctive properties makes use
of the mereological structure of the situation-tokens that support these deterministic
constraints.


Definition 4.9 (Merely Disjunctive (Gerrymandered) Types) $(\phi \vee \psi)$ is
merely disjunctive (or gerrymandered) relative to situation s iff, for every type
$\chi$, if $s \models ((\phi \vee \psi) \mathrel{|\!\sim} \chi)$, then there exist proper parts of s, $s_1$ and $s_2$, such that
$s_1 \models (\phi \mathrel{|\!\sim} \chi)$, $s_1 \not\models (\psi \mathrel{|\!\sim} \chi)$, $s_2 \models (\psi \mathrel{|\!\sim} \chi)$, and $s_2 \not\models (\phi \mathrel{|\!\sim} \chi)$.


Goodman’s quality of grue, for example, is a clear case of a gerrymandered, 
merely disjunctive property. Any causal constraint involving grue can be fac- 
tored into separate causal constraints, one involving blueness (and, perhaps, be- 
ing first observed after 2000) and the other involving greenness (and being first 
observed before 2000). These two separate constraints could each be supported 
by tokens that were proper parts of the token supporting the gerrymandered 
grue-constraint. 

Gerrymandered types are never causally relevant. This conclusion is imme- 
diate if we use the definitions of causal relevance given above and make the 
identification of natural types with non-gerrymandered types. This distinction 
between the causal irrelevance of grue and the (possible) causal relevance of 
green provides a simple and satisfactory solution to Goodman’s “new riddle of
induction.” 


4.8.2 Efficacy of Dispositional Properties 


It is clear that there are dispositional states. For example, a token has the 
dispositional state of being dormative just in case it supports some type $\phi$, and
also supports a causal constraint linking $\phi$ to the state of being asleep. If dispositional
states are situation-types, then they certainly satisfy the definition of
causal relevance. If the dispositional type of dormativity exists, then there undoubtedly
exists a causal constraint linking dormativity itself (and not just each of
its various realizations) with the state of being asleep. 

However, it is not a trivial matter to claim that dispositional situation-types 
exist. We cannot appeal to a general principle of abstraction, which would 
assert that every open formula corresponds to a situation-type. Such a general 
principle would almost certainly be logically inconsistent, since nothing would 
prevent us from forming the open formula corresponding to the property of 
heterologicality, the property of being a type that does not apply to itself:
$\lambda x\,\neg(x \models x)$. This property of heterologicality would support itself if and only
if it does not support itself, a logical impossibility.

Whether dispositional types in general exist, or whether dispositional types 
of certain special kinds exist, is not an a priori question, but one that can only 
be settled scientifically. If the best account of the world requires that we posit 
the existence of dispositional types, then we should do so, but only if this is so. 

Ever since Molière parodied the Aristotelian chemist, who explained the
sleep-inducing power of a narcotic by appealing to its property of dormativity, 
many have held that the causal irrelevance of dispositional properties is an a 
priori certainty. These enemies of dispositions often appeal to Hume’s dictum 
that the relationships between cause and effect must be logically contingent. 
However, if dispositional types exist, it seems plainly wrong to deny them causal
efficacy, Hume’s dictum notwithstanding. For instance, Feynman successfully
explained the explosion of the shuttle Challenger by reference to the fragility 
of the O-rings at low temperatures. This property of fragility (if it exists) was 
certainly causally relevant to the disaster, by virtue of being causally relevant 
to the shattering of the O-rings under the actual conditions of the launch. 

It is true of course that, once we have discovered that morphine is sleep- 
inducing, or that O-rings do shatter in cold weather, appeals to dormativity or 
fragility do not provide an interesting explanation of the phenomena. In fact, 
they merely restate the explananda. However, that observation falls far short 
of establishing that dispositional properties have no causal efficacy. 

Frank Jackson (Jackson, 1996, p. 202) has argued that to allow disposi- 
tions to be causes is to “admit a curious and ontologically extravagant kind of 
overdetermination.” I agree with Jackson up to a point: to posit dispositions 
as causes is to be committed to the existence of dispositional types, and this 
commitment should be undertaken only on the basis of some positive evidence. 
I accept Occam’s razor: we should not multiply entities unnecessarily. However, 
Occam’s razor is two-edged: we shouldn’t refuse to multiply entities in the face
of contrary evidence.

I disagree with Jackson (and with many others who take a similar stance, 
such as Jaegwon Kim (1997b)) in thinking that the avoidance of overdetermi- 
nation should be a factor in assessing the case for the existence of dispositional 
types. After all, what is wrong with overdetermination? We can all agree that 
where overdetermination involves some kind of unexplained coincidence we have 
good grounds for skepticism. It is unlikely that two or more sufficient causes 
would converge (for no explicable reason) on exactly the same effect. How- 
ever, the overdetermination involved in recognizing both the underlying physical 
properties and the disposition itself as causes does not involve any such
unexplained coincidence. There is a perfectly intelligible relationship between the
physical basis of the disposition and the disposition itself: the logical relation- 
ship between an instance of an existential generalization and the generalization 
itself. For example, let $\phi$ be the chemical features of morphine that give it its
dormative power, and let $\psi$ represent the state of being asleep. The relevant
dispositional state of a sample of morphine would be the conjunction $\phi \,\&\, (\phi \mathrel{|\!\sim} \psi)$.
I take it for granted that such conjunctions of particular chemical and modal
properties exist. The corresponding dispositional type of dormativity would be:


$$\exists Y\,(Y \,\&\, (Y \mathrel{|\!\sim} \psi))$$


The fact that a sample of morphine supports both the particular disposi- 
tional state and the general dispositional type is no coincidence. Given that 
Bonzo is a chimpanzee who is in the cage, there is no mere coincidence in the 
fact that it is the case both that Bonzo is in the cage and that a chimpanzee is 
in the cage. 

Overdetermination is troublesome only when one of two conditions is met: 
the two causes are non-overlapping and causally unrelated tokens, or the causes 
are two unrelated types of the same token. For example, if the victim is killed
by a volley of six simultaneously impacting bullets, there is a coincidence to
be explained. Alternatively, if the charge and the spin of the electron are
unrelated, and both are causally sufficient to produce some particular quantum
effect, the overdetermination of the effect would be a puzzling coincidence.
However, in the case at hand, neither of these conditions applies. The chemical
composition of the Challenger’s O-rings, and the fragility of these same O-rings, are
neither disjoint and unrelated situation-tokens, nor are they unrelated types of 
the same token. Hence, any “overdetermination” of the Challenger disaster by 
these two facts is entirely innocuous. 

In chapters 12 and 16, I will argue that we have good grounds for positing 
the existence of biological and mental dispositions as real situation-types. 


4.9 Piecemeal Causation 


In ordinary language, we sometimes describe one event as causing another, even 
though the two events overlap in time, so that parts of the cause are causally 
posterior to parts of the effect. David Lewis (Lewis, 1986b, pp. 172-173) has 
described this sense of causation as “piecemeal causation.” Using the resources 
of mereology, it is simple to define piecemeal causation: a token s piecemeal- 
causes s’ just in case every part of s’ is caused by some part of s. In this sense, 
for example, we can say that the Vietnam War caused the campus unrest of 
the 1960s, despite the fact that the war outlasted the unrest: every part of the 
unrest was caused by some part of the war. 
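
The definition can be illustrated on a small finite model. The following Python sketch is only an illustration under assumptions that are not part of the theory itself: tokens are modeled as non-empty sets of atoms, parthood as the subset relation, and the underlying causal relation as an explicitly listed, hypothetical set of cause-effect pairs. The names `parts`, `piecemeal_causes`, `war`, and `unrest` are invented for the example.

```python
from itertools import combinations


def parts(token):
    """All non-empty parts of a token. Assumption: a token is a frozenset
    of atoms and parthood is the subset relation."""
    atoms = sorted(token)
    return [frozenset(c)
            for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]


def piecemeal_causes(s, s_prime, causes):
    """s piecemeal-causes s_prime iff every part of s_prime is caused by
    some part of s. `causes` is a set of (cause, effect) token pairs."""
    s_parts = parts(s)
    return all(any((p, q) in causes for p in s_parts)
               for q in parts(s_prime))


# Hypothetical toy data: a war with three stages, unrest with two stages.
war = frozenset({"w1", "w2", "w3"})
unrest = frozenset({"u1", "u2"})
causes = {
    (frozenset({"w1"}), frozenset({"u1"})),
    (frozenset({"w2"}), frozenset({"u2"})),
    (frozenset({"w1", "w2"}), unrest),
}

print(piecemeal_causes(war, unrest, causes))   # True
print(piecemeal_causes(unrest, war, causes))   # False
```

In this toy model the last stage of the war, `w3`, causes no part of the unrest, so the war can outlast the unrest and still piecemeal-cause it.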


4.10 Desirable Features of the Theory 


At this point, I would like to review the desiderata mentioned in section 2 and 
check that the theory developed so far meets these goals. 


4.10.1 Transitivity, Asymmetry, and Veridicality 


Token causation has been defined in terms of the transitive closure of immediate 
causation, so at one level the question of transitivity is trivial, and the account 
we have given is unilluminating. 

However, on the deterministic conception, there is a modal property that 
is characteristic of causal connection: necessitation. The interesting question, 
therefore, is this: given that immediate causal connection involves necessitation, 
does it follow that the same is true of mediate causal connection? 

Of course, the answer to this question is clearly “Yes,” since the relation of 
necessitation is itself transitive. 

In the case of asymmetry, it must be admitted that, although the causal 
relation is indeed asymmetric, this is achieved in an ad hoc manner, by including 
as an essential component the presence of the causal priority relation, which 
is simply stipulated to be a partial ordering. The indeterministic definition of
causation proposed in the next chapter will include a definition of causal priority
that will go some way toward dispelling this ad hocness. 

The causal relation is veridical, in both terms. In the case of the cause, this 
is achieved by stipulation, but in the case of the effect, it is a natural by-product 
of the properties of necessity. What is necessitated by an actual situation must 
itself be actual. 


4.10.2 From Causal Mereology to Topology 


I hope to be able to define some basic spatial and temporal relations in causal 
terms. I take causation itself to give the dimension and direction of time: s is 
before s’ if s is causally prior to s’. To make this condition necessary as well 
as sufficient, we must introduce counterfactuals (as I do in the next section). 
A token s is before token s’ just in case for some substance a located at s,
and some condition $\phi$ on a, if a had had condition $\phi$, then s’ would have been
prevented (the resulting worlds do not contain s’ as a part).

I can define timelike and spacelike separation between situations: two situ- 
ations are separated in a timelike way just in case one is before the other. If 
they do not overlap, and no part of one is separated in a timelike way from any 
part of the other, then the separation is spacelike. 

I can use causal relations to define certain basic topological properties in
terms of causal and mereological ones. Let’s say that two situations $s_1$ and
$s_2$ are cooperative just in case there is a third situation $s_3$ that is caused by
the mereological sum of $s_1$ and $s_2$, yet no part of $s_3$ is caused by either $s_1$ or
$s_2$. If we assume that two situations can cooperate only if they either overlap
or are contiguous, then we can define contiguity in terms of cooperation and
non-overlap. Once we have a definition of contiguity, we can define continuity
and compactness:


Definition 4.10 (Cooperation) $\mathrm{Coo}(s_1, s_2) \leftrightarrow \exists s_3\,(s_1 \sqcup s_2 \rhd s_3 \,\&\, \neg\exists s_4 \sqsubseteq s_3\,(s_1 \rhd s_4 \vee s_2 \rhd s_4))$


Definition 4.11 (Contact) $\mathrm{Ctg}(s_1, s_2) \leftrightarrow \neg(s_1 \circ s_2) \,\&\, \mathrm{Coo}(s_1, s_2)$
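
The two definitions above can be checked mechanically on a finite model. The Python sketch below is a hedged illustration, not part of the theory: it assumes that tokens are sets of atoms, that the mereological sum is set union, that overlap is non-empty intersection, and that the causal relation is given as a hypothetical list of (cause, effect) pairs. All names (`cooperative`, `contiguous`, the atoms `a`, `b`, `c`, `e`, `f`) are invented for the example.

```python
from itertools import combinations


def parts(token):
    """Non-empty parts of a token (assumption: parthood is subset)."""
    atoms = sorted(token)
    return [frozenset(c)
            for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]


def cooperative(s1, s2, causes, tokens):
    """Coo(s1, s2): some s3 is caused by the sum of s1 and s2, while no
    part of s3 is caused by s1 alone or by s2 alone."""
    total = s1 | s2
    for s3 in tokens:
        if (total, s3) not in causes:
            continue
        if not any((s1, s4) in causes or (s2, s4) in causes
                   for s4 in parts(s3)):
            return True
    return False


def contiguous(s1, s2, causes, tokens):
    """Ctg(s1, s2): s1 and s2 do not overlap and are cooperative."""
    return not (s1 & s2) and cooperative(s1, s2, causes, tokens)


# Hypothetical toy model.
a, b, c = frozenset("a"), frozenset("b"), frozenset("c")
e, f = frozenset("e"), frozenset("f")
tokens = [a, b, c, a | b, b | c, a | b | c, e, f]
causes = {(a | b, e), (a | b | c, f)}

print(contiguous(a, b, causes, tokens))          # True
print(cooperative(a, c, causes, tokens))         # False
print(cooperative(a | b, b | c, causes, tokens)) # True
print(contiguous(a | b, b | c, causes, tokens))  # False: they overlap in b
```

The last two lines show why the non-overlap clause is needed: overlapping situations can cooperate without being in contact.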


Topological operations such as interiority and closure, and the properties 
closed and open can be defined in the usual way, using contact and the mereo- 
logical relations — see, for example, Clarke (1981), Clarke (1985), Gotts et al. 
(1996), and Asher and Vieu (1995). 

The qualitative or naive version of space and time developed in this way has 
two necessities as consequences: there can be no temporally backwards causa- 
tion, and there can be no simultaneous action at a spatial distance, since the 
direction of naive time is just the direction of causation, and naive distance is 
determined by the number of intermediate causal steps. However, when we move 
from qualitative to quantitative spacetime, from naive to theoretical chronom- 
etry and geometry, these necessities need no longer hold without exception. As 
Gregg Rosenberg has recently argued (in his dissertation at Indiana University),
the task of constructing a metric for spacetime faces two constraints: matching
the structure of naive, qualitative spacetime (with its direct correspondence
with causation), and achieving mathematical simplicity. The need for greater 
mathematical simplicity can, in some cases, force us to accept a certain degree 
of mismatch between spacetime and causation. Consequently, the most we can 
say a priori is that temporally backward causation and action at a distance must 
be the exception rather than the rule. 

The case of quantum mechanics points out another source of discrepancy 
between causation and spacetime. It may be that the metrical spacetime that 
fits perfectly (or very nearly perfectly) the causal relations holding at a macro- 
scopic level may do a very poor job of matching the structure of causal relations 
holding at the microscopic scale. Consequently, backward causation (Cramer 
(1986)) and action at a distance may be much more common at the microscopic 
scale, so long as we insist on locating the microscopic events in the spacetime 
constructed with macroscopic events in mind. 


4.10.3 Counterfactuals and Spacetime 


A causal theory of relativistic spacetime, along the lines of Robb (1914), depends 
on the notion of a world-line, the path that a light signal would take, if there 
existed such a signal starting in a given neighborhood. Thus, in order to begin 
such a project, we would need to have a theory of counterfactuals. In this 
section, I will sketch the semantics for a standard-issue counterfactual, one in 
which the antecedent specifies both a situation-token and a situation-type, and 
the consequent attributes some situation-type. Like the semantic theory of 
Frank Jackson (1987), my account of counterfactual conditionals makes explicit 
use of the concept of causal priority. 

Intuitively, the antecedent $(s : \phi)$ asks us to consider worlds in which token
s has been replaced (if necessary) by a minimal token of type $\phi$, with no change
to the tokens that are not causally posterior to s. As preliminaries, I need to
define the property of being a minimal token of type $\phi$, and the relation of
non-posteriority.


Definition 4.12 (Minimal Token) $(s \models_{\min} \phi) =_{df} (s \models \phi) \,\&\, \forall x\,(((x \models \phi) \,\&\, (x \sqsubseteq s)) \rightarrow x = s)$


Definition 4.13 (Non-Posteriority) $(s \not\succ s') =_{df} \neg\exists s''\,(s'' \sqsubseteq s' \,\&\, s'' \prec s)$


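Definition 4.14 (Counterfactuals)


$((s : \phi) \mathrel{\Box\!\!\rightarrow} \psi) =_{df}$
$\exists x \exists y\,(Ax \,\&\, \neg A(x \models \neg\phi) \,\&\, (y \models_{\min} \phi) \,\&\, (x \not\succ s) \,\&\, (y \not\succ s) \,\&\,$
$\Box((Ax \,\&\, Ay) \rightarrow \exists z\,((x \sqcup y) \rhd z \,\&\, (z \models \psi))))$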


Given such counterfactuals, we can use hypothetical light signals, or hypo- 
thetical acts of measurement with a standard, rigid rod, to specify metrical 
distances.


4.10.4 Modal Partiality


Thanks to the partial modal logic developed in appendix A, the account of 
causation that I have proposed in this chapter does allow for the causal efficacy 
of modal facts (including nomological and mathematical facts). Situations can 
be partial with respect to the modal types that they support, and the modal
types that a token supports contribute directly to its causal relations. A token
s causes token s’ only if s itself supports the modal fact that s necessitates s’,
that is, only if s supports $\Box(As \rightarrow As')$. This means that modal facts can enter into causal
explanations according to my definition. I will demonstrate this feature of the 
theory in more detail in chapter 7. 


4.10.5 Teleofunctional Generalizations 


Although teleological and functional explanations have been much discussed 
since the time of Plato, there is relatively little available on the question of the 
logical form of teleological laws. I conjecture that a teleological law consists of a
certain kind of claim linking three generic or parametric facts or types: $\phi$ has the
function of making it the case that $\psi$ in species or natural kind $\nu$. As has often
been noticed, teleological explanation seems to reverse the normal causal order:
$\psi$ is the final cause of $\phi$ in $\nu$, even though $\phi$ is causally prior (in the ordinary
sense) to $\psi$. This has very often seemed paradoxical, but the air of paradox
disappears once the logical form of the teleological claim is seen to involve the
assertion of a higher-order causal law. To say that $\psi$ is the final cause of $\phi$ in $\nu$
is to claim that there is a higher-order causal law whose antecedent contains the
fact that something is an instance of $\nu$ and the fact that there is a causal law
linking $\phi$ to $\psi$, and whose consequent is $\phi$ itself. Looking at the matter more
formally, we can distinguish two kinds of teleological generalizations: those in
which the functional attribute is causally necessary (in the presence of type $\nu$)
for its function, and those in which the functional attribute is causally sufficient:


$$T_{nec}(\phi, \psi, \nu) \Leftrightarrow ((\nu \,\&\, ((\nu \,\&\, \neg\phi) \mathrel{|\!\sim} \neg\psi)) \mathrel{|\!\sim} \phi)$$


$$T_{suf}(\phi, \psi, \nu) \Leftrightarrow ((\nu \,\&\, ((\nu \,\&\, \phi) \mathrel{|\!\sim} \psi)) \mathrel{|\!\sim} \phi)$$


These two kinds of teleological generalizations represent two extreme cases. 
In the following two chapters, I will develop models of causation that incorporate 
relations of objective probability, rather than those of strict necessity. 

Teleological explanation seems paradoxical because $\psi$ occurs in the antecedent
of the causal law and $\phi$ occurs in the consequent even though $\phi$ is causally prior
to $\psi$. This is not in fact a semantic irregularity, because $\psi$ does not occur in its
own right in the antecedent; instead it occurs as part of a causal conditional.
Although $\phi$ is causally prior to $\psi$, it is not causally prior to $(\nu \,\&\, \neg\phi) \mathrel{|\!\sim} \neg\psi$.

Consider a concrete example of a teleological connection. Suppose we claim
that the function of a robin’s tail is aerial stability. Let $\phi$ be the property of
having a tail, $\psi$ be the property of aerial stability, and $\nu$ be the property of being
a robin. The teleological claim consists of a claim that there is a higher-order
causal law according to which the joint fact of something’s being a robin and 
its being the case that having a tail is a causally necessary condition of aerial 
stability is a cause of that thing’s having a tail. But there is nothing mysterious 
about such a higher-order law. Such a law is a corollary of a Darwinian theory 
of natural selection. Darwinism is best understood not as the thesis that there 
are no final causes in nature, but as the hypothesis that all final causes in 
nature are ultimately explicable in terms of reproductive advantage. Assuming 
that aerial stability is an adaptive feature of robins and that having a tail is 
indeed causally necessary (in the case of robins) for aerial stability, then this 
causal connection between tails and aerial stability is part of the explanation for 
actual robins’ having tails: had their ancestors not acquired tails, robins would 
not have successfully reproduced. 

I will return to this point in chapter 7, and I will develop a theory of teleo- 
functionality in some detail in chapter 12. 


4.10.6 Compatibility with Indeterminism 


This is the unfinished business that I will take up in the next two chapters. 


4.11 Applying the Theory to Some Examples 


In order to demonstrate what this account both can and cannot do, I will sketch 
out applications of it to four simple examples of causal setups: a finite automa- 
ton, the determination of supply and demand in a marketplace (according to 
a rarefied, microeconomic model), a Turing machine, and one-dimensional bil- 
liards. 


4.11.1 A Finite Automaton 


The basic types for a theory of a finite automaton will consist of a finite set
of internal state types, a finite set of input types, and a finite set of output
types. These three sets can be assumed to be disjoint. Let R be the set of all
possible run-types of the automaton, i.e., R consists of finite or $\omega$-sequences of
triples of input, output, and internal state types. We can define a set of representations
of possible tokens recursively. First, I will define a set Ant of possible
token-antecedents by means of the following recursive definition:


• $(\emptyset, i, t) \in Ant$, if $(i, t)$ represents a possible initial input and internal state
of a run in R.


• $(\alpha, i, t, o) \in Ant$ if $\alpha \in Ant$, and there is a run $r \in R$ such that the inputs
and internal states of r agree with $\alpha$ for the first n stages (where n is the
recursive depth of $\alpha$, and i and t are the input and internal state of the
(n+1)th stage of r).




An atomic situation-token is defined to be a tuple of the form $(\alpha, i)$, $(\alpha, i, t)$,
or $(\alpha, i, t, o)$, where i, t and o are, respectively, input, internal state, and output
types, and where $(\alpha, i, t, o) \in Ant$. The first sort of atom represents a token
whose only type is of the input variety, the second sort represents a token whose
only type is of the internal-state variety, and the third represents an atomic token
whose only type is of the output variety. The set of tokens Tok can be defined
as containing the mereological sums of all coherent sets of atoms, where a set
of atoms is coherent if all of its members agree in their sequence of types with
one run $r \in R$. This construction fully determines the interpretation of $\sqsubseteq$.

It only remains to define the causal priority relation $\prec$. I will first define this
relation only for atomic tokens. If s is a sub-sequence of $\alpha$, and $s' = (\alpha, i, t)$,
then $s \prec s'$. (The internal-state tokens are causally posterior to all the previous
input and internal state tokens.) Furthermore,


$$(\alpha, i) \prec (\alpha, i, t) \prec (\alpha, i, t, o)$$


Finally, we take the transitive closure of the relation so far defined. For
complex tokens, if $s = \bigsqcup A$, then an atomic token s’ is causally prior to s iff
s’ is prior to some atomic member of A and $s' \notin A$. A complex token s is prior
to s’ just in case all of its atomic parts are prior to s’.

The atomic input tokens are uncaused and are not posterior to anything. 
The output tokens are causally inert (prior to nothing), and they are posterior 
to the simultaneous input and state tokens, as well as to all previous input and 
state tokens. The state tokens are prior to the contemporaneous output token, 
posterior to the contemporaneous input token, and posterior to all previous 
input and state tokens. 

The initial internal state token of a run is uncaused. Every subsequent 
internal state token is caused by the sum of the immediately preceding state and 
input tokens. Each output token is caused by the sum of its contemporaneous 
input and state tokens. 
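
The causal structure just described can be generated for a concrete run. The Python sketch below is an illustrative simplification and not the official construction: tokens are represented as (kind, stage, type) triples rather than as the recursively built tuples of the text, and the transition function `delta`, the output function `out`, and the parity automaton are hypothetical examples. Following the description above, each non-initial state token is caused by the sum of the preceding state and input tokens, and each output token by the sum of its contemporaneous input and state tokens.

```python
def causal_structure(delta, out, initial_state, inputs):
    """Return (tokens, causes) for one run of a finite automaton.

    tokens: list of (kind, stage, type) triples.
    causes: dict mapping each caused token to the set of tokens whose
            sum is its cause. Uncaused tokens do not appear as keys.
    """
    tokens, causes = [], {}
    state, previous = initial_state, None
    for n, symbol in enumerate(inputs):
        in_tok = ("input", n, symbol)
        st_tok = ("state", n, state)
        out_tok = ("output", n, out(state, symbol))
        tokens += [in_tok, st_tok, out_tok]
        causes[out_tok] = {in_tok, st_tok}   # contemporaneous input + state
        if previous is not None:
            causes[st_tok] = previous        # preceding input + state
        previous = {in_tok, st_tok}
        state = delta(state, symbol)
    return tokens, causes


# Hypothetical example: an automaton tracking the parity of the 1s read.
def delta(state, bit):
    if bit == 0:
        return state
    return "odd" if state == "even" else "even"


def out(state, bit):
    return delta(state, bit)


tokens, causes = causal_structure(delta, out, "even", [1, 0, 1])
```

In the resulting structure the input tokens and the initial state token have no cause, while every other token has a causally prior sufficient condition, which is the sense of determinism at issue in this example.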

In one sense, this setup is not deterministic, because the input tokens at each 
stage are wholly uncaused, exogenous to the system. However, I do not think 
that this is a very interesting sense of ‘indeterministic’. The input tokens have 
a location in time only insofar as they causally impinge upon one rather than 
another internal state token in the run. We could imagine that the input tokens 
are all located in ‘eternity’, having only an external relation to the time-sequence 
of the machine. Or, we could imagine that in each run, the input tokens are all 
fixed in one fell swoop at the very beginning and held in a kind of suspended 
animation until the appropriate stage. What makes the finite automaton deter- 
ministic is that for every caused token (and every causally explicable token-type 
pair), there exists a strictly sufficient, causally prior condition. 


4.11.2 Supply and Demand 


The next example is the determination of market price and quantity by supply 
and demand in the model of classical microeconomics. In this case there is
an infinite collection of situation types comprising a set of demand-situation
types, supply-situation types, and market-clearing types. Each demand, supply,
or market situation type is of one of the following forms: (1) at price x the
quantity of goods demanded/supplied/exchanged is at least y, and (2) at price
x the quantity of goods demanded/supplied/exchanged is at most y. A set of
demand types is coherent if all of its constraints on demand can be satisfied by
a monotonically decreasing demand curve, with the quantity ranging from 0 to
$\infty$, exclusive. Similarly, a set of supply types is coherent if it can be satisfied
by a monotonically increasing supply curve.

Supply and demand tokens are all exogenous: none is posterior to any other 
token. Hence, we can simply identify demand and supply tokens with the cor- 
responding types. Market tokens are prior to nothing, and each market token is 
posterior to a set of supply and demand tokens. We can define the set of atomic 
market tokens as follows: 


• If m is a market type of the form ‘at price x the quantity of goods
exchanged is at least y’, A is the set of demand types of the form ‘at price x,
the quantity of goods demanded is at least z’, and B is the set of supply
tokens of the form ‘at price x, the quantity of goods supplied is at least
z’, for some z > y, then (m, A, B) is an atomic market token.


• If m is a market type of the form ‘at price x the quantity of goods
exchanged is at most y’, A is the set of demand types of the form ‘at price x,
the quantity of goods demanded is at most z’, and B is the set of supply
tokens of the form ‘at price x, the quantity of goods supplied is at most
z’, for some z < y, then (m, A, B) is an atomic market token.


Once again, we can let the set of tokens be the sum of all coherent sets of 
atomic tokens. 

An atomic market token (m, A, B) is causally posterior to an atomic demand
token d iff $d \in A$, and it is causally posterior to an atomic supply token s iff
$s \in B$. The causal priority relation can be extended to the set of all tokens in
the same way as in the previous example.

It is easy to check that every market token is caused by a token containing 
both supply and demand tokens as parts. In addition, every type of every 
market token can be causally explained by the types of its causes. 

The microeconomic model illustrates the fact that there can be constraints
that are not causal constraints. For example, if the world contains a demand
token of the type ‘at price x, at least y is demanded’, then for every $\delta$ there
must be an $\epsilon$ such that the world contains a demand token of the type ‘at price
$x + \delta$, at least $y + \epsilon$ is demanded’. However, the first token does not cause the
second, nor is there a causal explanation for the type of the second token.


4.11.3 A Turing Machine 


In the case of a standard Turing machine, there are just two tape-square types: 
0 and 1. The head of the machine has types of two kinds: internal state and
location. There must be a finite set of internal state types, and the location
types have the structure of the integers: $\ldots, -2, -1, 0, 1, 2, \ldots$. Once again, the
atomic tokens can be identified with an atomic (simple) type, together with a 
chain of possible type-transitions leading to an instance of that type. I will 
define the set H of atomic head-state tokens recursively. 


• $(\emptyset, i, t, l, o) \in H$, where $(i, t, l, o)$ represents the input, internal state,
location and output type of a possible initial state of the machine.


• If $\alpha \in H$, then $(\alpha, i, t, l, o) \in H$, if $(i, t, l, o)$ represents a possible successor
state to the final state of $\alpha$.


Initial tape-square tokens can be identified with triples $(\emptyset, j, n)$, where
$j \in \{0, 1\}$, and n is an integer (representing the location of the square).
Non-initial tape-square tokens can be identified with triples $(\alpha, j, n)$, where the final
segment of $\alpha$ is $(\beta, i, t, n, j)$ for some $\beta$, i, t.

A set of atomic tokens is coherent if it is compatible with a possible run of 
the machine. 

All of the initial tokens, both of the head and of each of the tape-squares, 
are exogenous (posterior to nothing). The causal priority relation on tokens can 
be defined in the usual way, following the pattern of the previous examples. 

The head of the machine experiences a sequence of state-tokens that can be 
identified with clock time. There is no reason to suppose that the tape-squares 
experience a synchronized succession. In the simplest model, the tape squares 
can be imagined to tunnel through time, so that when the head reaches a square 
for the first time, it is affected by the initial state-token for that square, and 
when the head returns to a square after m units of its time, it interacts with 
the very state-token that it produced on its last visit. 

We can model the causation relation in a very intuitive way. However, when 
it comes to causal explanation, the deterministic conception we have adopted 
in this chapter produces what I call “explanation inflation.” 

For example, suppose square n was written with value j at stage a of the
head’s activity. Suppose that the head returns to n for the first time at stage 
a+m. Does the fact that the square had value j at stage a explain the input 
value of the head at stage a +m? It does only if it is physically impossible for 
the head to return to square n in less than m units of time, regardless of the 
state of the head or of the other tape squares. Otherwise, any explanation of 
the input value of the head at stage a + m must include enough information 
about the head and the other tape squares to guarantee that the head will not 
have returned to square n in the intervening period. 

What is lacking is the notion of being an adequate but defeasible explanation. 
I think we should say that square n’s being written with value j at stage a is
an adequate but defeasible explanation of the head’s reading j when it returns 
to n at stage a+m. Information about other tape squares is relevant only if 
the head actually returned to square n in the intervening period. 
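
The proposal can be made concrete with a small check on a recorded run. The Python sketch below is only an illustration under stated assumptions: the run is given as a hypothetical list `head_positions`, whose k-th entry is the head's location at stage k, and the function name `explanation_undefeated` is invented for the example. It treats the explanation as defeated exactly when the head revisits square n strictly between stage a and stage a + m.

```python
def explanation_undefeated(head_positions, n, a, m):
    """Check whether 'square n was written at stage a' remains an
    undefeated explanation of what the head reads at stage a + m.

    head_positions[k] is the head's location at stage k (an assumed
    representation of a run, not part of the original model).
    """
    if head_positions[a] != n or head_positions[a + m] != n:
        raise ValueError("the head must be at square n at stages a and a + m")
    # Defeated iff the head visited n in the intervening period.
    return all(pos != n for pos in head_positions[a + 1 : a + m])


# Hypothetical trace of head locations over seven stages.
trace = [0, 1, 2, 1, 0, 1, 0]

print(explanation_undefeated(trace, n=0, a=0, m=4))  # True
print(explanation_undefeated(trace, n=1, a=1, m=4))  # False: revisited at stage 3
```

Only in the second case does information about the rest of the run become relevant, since the head may have rewritten the square during the intervening visit.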


4.11.4 One-Dimensional Billiards


In this example, we have two elastic, circular disks on a one-dimensional runway 
in Flatland (a two-dimensional universe). We will assume Newtonian mechanics 
with no friction and no gravity or other forces. The disks are each infinitesimal 
in diameter and have one unit of mass. There are situation-types of two kinds: 
velocity and position. Each type can take any real number as its value (positive 
or negative). 

The only event (besides the uninterrupted rolling of disks) that is possible is 
the event of collision, and this can happen at most once, since after a collision 
the distance between the two disks will increase forever. 

Disk-state tokens fall into four kinds: initial state tokens, pre-collision state 
tokens, collision tokens, and post-collision state tokens. An initial token can be 
identified with an ordered pair (p, v), where p and v are real numbers represent- 
ing position and velocity, respectively. 

There are two kinds of collisions: head-on and rear-end. If the initial states of
the disks are $(p_1, v_1)$ and $(p_2, v_2)$, then a collision will occur if $(v_1 - v_2)(p_2 - p_1) > 0$.
If the value of $v_1 \cdot v_2$ is negative, then the collision will be head-on. If $v_1 \cdot v_2$
is positive, then the collision will be rear-end. (If either $v_1$ or $v_2$ is zero, then
the collision is with a stationary disk, which I will count as both head-on and
rear-end.)

A collision can be represented as a tuple $(p_1, v_1, p_2, v_2)$, where $(p_1, v_1)$ and
$(p_2, v_2)$ are both initial-state tokens, and $(v_1 - v_2)(p_2 - p_1) > 0$.

If the collision is head-on, then it occurs at position $\frac{p_1 v_2 + p_2 v_1}{v_1 + v_2}$
(taking $v_1$ and $v_2$ in this expression as speeds, that is, as absolute values). If the
collision is rear-end, then it occurs at position $\frac{p_1 v_2 - p_2 v_1}{v_2 - v_1}$. After the collision,
the first disk assumes the velocity $v_2$, and the second disk assumes the velocity
$v_1$.
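
As a worked check of these conditions, the following Python sketch computes whether and where two disks collide. It is a hedged illustration rather than part of the model: the function names are invented, velocities are treated as signed numbers throughout, and the head-on/rear-end classification by the sign of the product of the two velocities is an assumption about how the distinction is meant. The position is obtained directly from uniform motion, which agrees with both position formulas once signs are taken into account.

```python
def collision(p1, v1, p2, v2):
    """Return (kind, position, time) for the single collision of two disks
    with initial states (p1, v1) and (p2, v2), or None if they never meet."""
    # A collision occurs iff (v1 - v2)(p2 - p1) > 0.
    if (v1 - v2) * (p2 - p1) <= 0:
        return None
    time = (p2 - p1) / (v1 - v2)   # moment at which the positions coincide
    position = p1 + v1 * time      # uniform motion of the first disk
    product = v1 * v2
    if product < 0:
        kind = "head-on"           # the disks move in opposite directions
    elif product > 0:
        kind = "rear-end"          # the disks move in the same direction
    else:
        kind = "both"              # one disk is stationary
    return kind, position, time


def velocities_after(v1, v2):
    """Equal masses and an elastic collision: the disks exchange velocities."""
    return v2, v1


print(collision(0.0, 1.0, 10.0, -1.0))  # ('head-on', 5.0, 5.0)
print(collision(0.0, 3.0, 4.0, 1.0))    # ('rear-end', 6.0, 2.0)
print(collision(0.0, 1.0, 10.0, 2.0))   # None
```

In the head-on case the result agrees with the speed-based expression: with speeds 1 and 1, (0·1 + 10·1)/(1 + 1) = 5.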

A pre-collision state token can be represented as a quintuple $(p_1, v_1, p_2, v_2, p_3)$,
where:


• $(p_1, v_1)$ and $(p_2, v_2)$ are initial state tokens,

• $p_3 > p_1$ iff $v_1 - v_2$ is positive, $p_3 < p_1$ iff $v_1 - v_2$ is negative, and

• $p_3 < \frac{p_1 v_2 + p_2 v_1}{v_1 + v_2}$ if the eventual collision is head-on and $v_1 - v_2$ is positive,
and $p_3 > \frac{p_1 v_2 + p_2 v_1}{v_1 + v_2}$ if the eventual collision is head-on and $v_1 - v_2$ is negative,

• $p_3 < \frac{p_1 v_2 - p_2 v_1}{v_2 - v_1}$ if the eventual collision is rear-end and $v_1 - v_2$ is positive,
and $p_3 > \frac{p_1 v_2 - p_2 v_1}{v_2 - v_1}$ if the eventual collision is rear-end and $v_1 - v_2$ is negative.


A post head-on collision token can be represented as a sextuple
$(p_1, v_1, p_2, v_2, p_3, v_3)$, where $(p_1, v_1, p_2, v_2)$ is a head-on collision token, either
$v_3 = v_1$ or $v_3 = v_2$, and $p_3 > \frac{p_1 v_2 + p_2 v_1}{v_1 + v_2}$ if the difference between $v_3$ and the
other velocity is positive, and $p_3 < \frac{p_1 v_2 + p_2 v_1}{v_1 + v_2}$ if the difference between $v_3$ and
the other velocity is negative.

A post rear-end collision token can be represented as a sextuple
$(p_1, v_1, p_2, v_2, p_3, v_3)$, where $(p_1, v_1, p_2, v_2)$ is a rear-end collision token, either
$v_3 = v_1$ or $v_3 = v_2$, and $p_3 > \frac{p_1 v_2 - p_2 v_1}{v_2 - v_1}$ if the difference between $v_3$ and the
other velocity is positive, and $p_3 < \frac{p_1 v_2 - p_2 v_1}{v_2 - v_1}$ if the difference between $v_3$ and
the other velocity is negative.

Each state token has two atomic tokens as proper parts: one corresponding 
to the position of the disk, and the other to the velocity. If s is a state token, 
then (s,V) and (s, P) can represent these two atomic parts. 

A set of atomic tokens is coherent just in case every state token constituent of 
every atomic token can be fit into a physically possible pair of disk trajectories. 
The set of tokens consists of the sum of all coherent sets of atomic tokens. 

The causal priority relation can be defined as follows: 


• Initial state tokens are not posterior to anything.


• Pre-collision state tokens are posterior to both their constituent initial
state tokens, and to nothing else.


• Collision state tokens are posterior to both of their constituent initial state
tokens, and to nothing else.


• Post-collision state tokens are posterior to their constituent collision state
tokens, and, by transitive closure, to the constituent initial state tokens
of these.


I do not think that we should say that a pre-collision token is posterior to 
any of the earlier pre-collision tokens, nor that a post-collision token is posterior 
to any of the earlier post-collision tokens. The only tokens that need to have 
causal efficacy are the initial state tokens and the collision tokens. If we said 
that a pre-collision token was posterior to all earlier pre-collision tokens, then 
the causal priority relation would not be well founded in this model, since < on 
the real numbers is not well founded. 
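
The failure of well-foundedness can be made explicit. As a sketch, suppose a disk has a pre-collision token $\sigma_t$ for every time $t$ strictly between the initial moment $0$ and the time $t_c$ of the collision (the labels $\sigma_t$ and $t_c$ are hypothetical, introduced only for this illustration). The rejected proposal would then produce a chain of priority with no first member:

```latex
% \sigma_t is a hypothetical label for the pre-collision token at time t.
% If every earlier pre-collision token were prior to every later one, then
% for any t with 0 < t < t_c we would have
\[
  \cdots \prec \sigma_{t/8} \prec \sigma_{t/4} \prec \sigma_{t/2} \prec \sigma_{t},
\]
% an infinite descending chain, since t/2^n > 0 for every n.
```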

In this simple, two-disk setup, we would not need collision tokens at all, 
since pre-collision and post-collision tokens can be distinguished by comparing 
their positions and velocities with those of the constituent initial state tokens. 
However, in more complicated setups, in which multiple collisions and multiple 
reversals of direction are possible, collision tokens are essential to a correct 
representation of the causal structure. 

In this example, we can see the inflation of both causation and explanation 
as a result of the deterministic conception of causation, despite the fact that, 
as in the last example, the setup is entirely deterministic. The cause of any 
pre-collision token, for example, must include the initial states of both disks, 
since the initial state of the disk whose state is being specified is not a sufficient 
condition, by itself, of reaching the pre-collision state in question. We must 
add the fact that the other disk was far enough away and either slow enough 
to avoid an intervening collision or headed in the wrong direction. This clashes
with what is intuitively correct, namely that the initial state of the other disk
is not causally connected to the pre-collision states of the first disk. 

Similarly, according to the deterministic conception, any causal explanation 
of a pre-collision state of one disk must include enough information about the 
initial state of the other disk to ensure that a collision has not in fact taken 
place. 

This inflation becomes far worse if we consider a setup in which the number 
of disks is indefinite. In order to support a network of causal connections under 
the deterministic conception, we must add negative initial state tokens, one for 
each position on the real line at which no disk is located in the original situation. 
This non-denumerable totality of tokens will be part of the cause of every non- 
initial token, and will be involved in every causal explanation, since without 
guaranteeing that no disk has been omitted, no condition involving any set of 
disks can provide a strictly sufficient condition for any subsequent state. 


4.12 Verifying the Axioms of Chapter 3 


Now that we have a well-defined language, complete with a semantic theory and
logic, and a series of definitions of the various causal relations in terms of that
language, we can go back to the intuitively plausible axioms of causation
that I proposed in chapter 3 and see whether they come out as valid, given our logic
and our definitions.

Here again is a list of the axioms from chapter 3: 


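• Axiom 3.1  $p \sqsubseteq q \leftrightarrow \forall r\,(r \circ p \rightarrow r \circ q)$

• Axiom 3.2  $\exists p\,\phi(p) \rightarrow \exists q\,\forall r\,(r \circ q \leftrightarrow \exists u\,(\phi(u) \,\&\, u \circ r))$

• Axiom 3.3  $p = q \leftrightarrow (p \sqsubseteq q \,\&\, q \sqsubseteq p)$

• Axiom 3.4 [Irreflexivity of causation]  $a \rhd b \rightarrow \neg(b \sqsubseteq a)$

• Axiom 3.5 [Right closure under part]  $a \rhd b \,\&\, c \sqsubseteq b \rightarrow a \rhd c$


• Axiom 3.6 [Transitivity]  $a \rhd b \,\&\, b \rhd c \rightarrow a \rhd c$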


The first three of these axioms are simply the familiar axioms of mereology.
Since our canonical models interpret $\sqsubseteq$ by means of the subset relation between
supersaturated sets of formulas, it is easy to verify that these three axioms are
indeed logically valid.[4]
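
The claim that a subset interpretation validates the mereological axioms can be checked by brute force in a small case. The sketch below is only an illustration under stated assumptions: it replaces supersaturated sets of formulas with the nonempty subsets of a three-element set, reads overlap as nonempty intersection and fusion as union, and takes the three axioms in their familiar form (parthood via overlap, existence of sums, antisymmetry). All names are hypothetical.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")  # hypothetical atoms of a toy model

# Tokens are modeled as the nonempty subsets of ATOMS.
TOKENS = [frozenset(c) for n in range(1, len(ATOMS) + 1)
          for c in combinations(ATOMS, n)]

def part(p, q):
    """p is a part of q: interpreted as the subset relation."""
    return p <= q

def overlap(p, q):
    """p overlaps q: they share a common part (nonempty intersection)."""
    return bool(p & q)

def axiom_3_1():
    # p is part of q  iff  everything that overlaps p also overlaps q
    return all(part(p, q) == all(overlap(r, q) for r in TOKENS if overlap(r, p))
               for p in TOKENS for q in TOKENS)

def axiom_3_2(phi):
    # If some token satisfies phi, the union q of all phi-tokens overlaps
    # exactly those tokens r that overlap at least one phi-token.
    sat = [u for u in TOKENS if phi(u)]
    if not sat:
        return True  # the antecedent fails, so the axiom holds vacuously
    q = frozenset().union(*sat)
    return all(overlap(r, q) == any(overlap(u, r) for u in sat) for r in TOKENS)

def axiom_3_3():
    # p = q  iff  p and q are parts of each other
    return all((p == q) == (part(p, q) and part(q, p))
               for p in TOKENS for q in TOKENS)
```

Calling `axiom_3_1()`, `axiom_3_3()`, and `axiom_3_2` with any predicate on tokens (for instance `lambda u: "a" in u`) returns `True` in this toy model.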

Since we are working in a four-valued logic, and since the causal relation $\rhd$
is not bivalent in every situation, we must replace the final three axioms with
corresponding inference rules:


• Irreflexivity of causation  $a \rhd b \vdash \neg(b \sqsubseteq a)$
• Right closure under part  $a \rhd b \,\&\, c \sqsubseteq b \vdash a \rhd c$
• Transitivity  $a \rhd b \,\&\, b \rhd c \vdash a \rhd c$


[4] There is a complication in the case of Axiom 3.2: strictly speaking, it is logically valid only
if the open formula $\phi$ is so constructed as to be bivalent in every situation. This restriction
won’t affect our uses of Axiom 3.2.


The irreflexivity of causation is an immediate consequence of the fact that
$a \rhd b$ implies $a \prec b$, and the causal priority relation $\prec$ has been stipulated to be
irreflexive. In the following chapter, I will define $\prec$ as asymmetric necessitation,
which will also be evidently irreflexive.

Right closure under part is clearly valid, since whenever one token constrains 
the actuality of another, it always constrains the actuality of all of its parts. 

Finally, transitivity of causation follows from the transitivity of the causal
priority relation $\prec$, together with the obvious transitivity of the necessitation
relation.


5.1 Beyond Determinism


In the last chapter, we encountered a number of reasons for being dissatisfied 
with a deterministic conception of causation, both for token-causation and for 
causal explanation. The most important reason for such dissatisfaction is that 
an ideal definition of causation should not make causation impossible in an en- 
tirely indeterministic world. For all we know, the actual world is indeterministic 
through and through, and yet we can be reasonably confident that causation is 
a reality in our world. 

There are two more specific problems with a deterministic conception of 
token-causation. 


1. The Modal Inseparability Problem. If causes necessitate their effects, and 
effects necessitate their causes, then a cause and its effect inhabit exactly 
the same worlds. It seems quite reasonable to suppose that the actual 
causes of a situation are essential to its identity: it wouldn’t be the very 
token it is if its causes were different. At the same time, it seems reasonable 
to suppose that two situations that are modally inseparable are identical. 
When these two hypotheses are combined with a deterministic conception 
of token causation, we are forced into the absurdity that causes and effects 
are identical. 


2. The Over-Generation of Causal Connections. As we saw in the last chap- 
ter, a deterministic conception of token-causation makes entirely quiescent 
situations causally efficacious, so long as they fill space and time that might 
be filled by interfering situations. In short, the deterministic conception 
cannot distinguish between quiescence and action. 


It would be difficult to combine an indeterministic account of token-causation 
with a deterministic conception of causal explanation. We would have cases of 
causation without causal explanation, which would be odd, to say the least. 


There are two additional reasons for being dissatisfied with the deterministic
conception of causal explanation: 


1. Inexpressibility and Inaccessibility of Sufficient Conditions. Even if type 
determinism were true, the types that are actually sufficient for some non- 
trivial explanandum might be inexpressible in human language or thought. 
They might involve infinite, or even non-denumerable, complexity. In 
addition, these conditions might be strongly inaccessible to observation 
or measurement, requiring, for example, absolute precision. Neither of 
these problems constitutes an insuperable objection to making reference to 
sufficient conditions in defining causal explanation, but they surely make 
it preferable to define explanation without reference to such conditions, if 
possible. 


2. The Inflation of Causal Explanations. As in the case of token causation, a 
deterministic conception of causal explanation makes negative and highly 
probable conditions, such as the absence of UFO agency of a certain kind, 
an essential part of causal explanations of mundane occurrences. The 
deterministic conception cannot distinguish between the presence of fa- 
vorable tendencies and the mere absence of contrary ones. 


5.2 Why an Indeterministic Account Is Difficult 


There are several desiderata for a theory of causation that are easily met under 
a deterministic conception, but far more difficult to satisfy when the assumption 
of the strict necessitation of effects is dropped. First of all, there are three purely 
formal properties that ought to fall naturally out of a theory of causation: 


• The veridicality of token causation and causal explanation

• The transitivity of token causation


• The irreflexivity and asymmetry of the transitive closure of causal
explanation (no explanatory loops)


In the deterministic account, the first two can be derived directly from the 
veridicality and transitivity of strict necessitation. In developing an indeter- 
ministic account, we must find a modal or statistical relation that supports 
veridicality and transitivity without necessitation. If, in addition, irreflexivity 
falls out of the account for free, all the better. 

There are two cases that have posed serious problems for probabilistic the- 
ories of causation in the past. These cases are: 


• Causes with negative statistical relevance to their effects


• Preemption of causation

The first case demonstrates that positive statistical significance is not
necessary for causal connection. For instance, a particular form of surgery may
reliably increase the chances of survival, yet, in particular cases the surgery 
may kill. The second case demonstrates that positive statistical significance 
is not sufficient for causal connection. One event may raise the probability of 
some subsequent event without actually causing it, if some competing cause 
intervenes to break the causal connection between the would-be cause and its 
effect. 

Finally, there is an important material property of causation that is far from 
trivial to verify: the Markovian statistical independence property. A causal 
structure has the property if every effect is statistically independent, conditional 
on one of its causes, from anything screened off from it by that cause. When 
thinking about propositions or ‘events’ (in the statistical sense), this property is 
difficult to motivate and to secure. However, the ontological resources developed 
in the preceding chapters are adequate to the task. 


5.3 If Not Determinism, Then What? 


The most natural thing to do, once we have abandoned determinism, is to go 
probabilistic. We could require that a cause raise the probability of its effect 
above some fixed threshold (say 90%). We could require that the cause raise 
the probability of its effect from an infinitesimal to some finite probability. Or, 
we could simply require that the cause raise the probability of its effect, period. 

All such probabilistic relations suffer from the affliction of non-monotonicity.
That is, situation $s$ might raise the probability of situation $s'$, but the larger
situation $s \sqcup s''$ might lower the probability of $s'$. Similarly, the type $\phi$ might
raise the probability that the next event will be of type $\psi$, but the stronger type
$\phi \,\&\, \chi$ might lower that probability. This non-monotonicity plays havoc both
with the transitivity of the relation and with the Markov screening-off property.

The solution, I think, is to talk about robustly raising the probability of the
effect. A situation $s$ robustly raises the probability of $s'$, relative to world $w$, just
in case both $s$ and any extension of $s$ in $w$ raise the probability of $s'$. Similarly,
the pair $(s : \phi)$ robustly raises the probability that a $\psi$ situation will follow
(relative to $w$) when both $\phi$ and the conjunction of $\phi$ with any other type true
of $s$ raise the probability that a $\psi$ situation will follow.
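
The contrast between raising and robustly raising a probability can be illustrated with a small finite model. This is a sketch under assumptions that go beyond the text: worlds are sets of atomic facts carrying made-up probabilities, a situation is a set of facts, an extension of a situation in a world is any larger set of facts true in that world, and causal priority is ignored. The match example and all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model: each world is the set of atomic facts true in it,
# paired with a made-up probability.
WORLDS = {
    frozenset({"struck", "oxygen", "lit"}): Fraction(4, 10),
    frozenset({"struck", "wet", "oxygen"}): Fraction(1, 10),
    frozenset({"oxygen"}): Fraction(4, 10),
    frozenset({"wet", "oxygen"}): Fraction(1, 10),
}

def prob(s):
    """Probability that situation s (a set of facts) is actual."""
    return sum(p for w, p in WORLDS.items() if s <= w)

def raises(s, effect):
    """True if conditioning on s strictly raises the probability of effect."""
    return prob(s) > 0 and prob(s | effect) / prob(s) > prob(effect)

def extensions(s, world):
    """All situations t with s <= t <= world, including s itself."""
    rest = sorted(world - s)
    for n in range(len(rest) + 1):
        for extra in combinations(rest, n):
            yield s | frozenset(extra)

def robustly_raises(s, effect, world):
    """s and every extension of s within world raise the probability of effect."""
    return all(raises(t, effect) for t in extensions(s, world))
```

In this model the striking of the match raises the probability of its lighting, and does so robustly in the world where the match is dry; in the world where the match is wet, the extension that includes the wetness lowers that probability, so the raising is not robust there.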

To implement this idea, we would have to introduce a probability measure 
into our model structures. I will do this in the following chapter. In this chapter, 
however, I want to introduce a qualitative analogue of probability instead, partly 
because it is somewhat simpler, and partly to establish connections between 
this theory and existing work on conditional logic (in both the Stalnaker and 
Ernest Adams traditions). Consequently, in this chapter I will make use of the
partial conditional logic developed in appendix A, where $\phi \Box\!\!\to \psi$ is taken as
representing a probability of $\psi$ conditional on $\phi$ that is infinitely close to 1. As I
discuss there, this conditional is a version of Morreau’s fainthearted conditional
(Morreau (1997)).

A model M of situation logic with fainthearted conditionals consists of an
n-tuple: (Sit, Typ, $\models$, $\sqsubseteq$, $f'$, $f^{\dagger}$), where:


• Sit is a nonempty set, the set of situation-tokens.


• Typ is a nonempty set of situation-types, closed under the various logical
and modal connectives.


• $\models$ is a binary relation on Sit × Typ.


• $\sqsubseteq$ is a partial, antisymmetric ordering of Sit.

• The selection functions $f'$, $f^{\dagger}$ are functions from Sit × P(Sit) into P(Sit).

The class of atomic situation-types consists of types of the following forms:

• $\phi, \psi, \chi, \ldots$ (basic atomic types)

• $\Box\phi, \Diamond\phi$ (modal types)

• $s \sqsubseteq s', s = s'$ (mereological types)

• $s \models \phi$ (classificatory types)

• $As$ (actuality types)

• $(\phi \Box\!\!\to \psi)$ (conditional types)


The first five of these forms are identical to forms used in the deterministic 
model developed in the last chapter. There are two changes: first, the condi- 
tional types are added to the list, and, second, the causal priority types are 
deleted. 

One additional advantage to an indeterministic model is this: it becomes 
possible to give a definition of causal priority in terms of modal and mereological 
properties, making it no longer necessary to treat it as an unanalyzed primitive. 
As I have already argued in the last chapter, it seems natural to treat the 
causal antecedents of a token as essential to it. Consequently, I will assume 
that the actuality of a token strictly necessitates the actuality of all of its causal 
antecedents. In an indeterministic model, we will no longer assume that causes 
necessitate their effects. Consequently, it is possible to define causal priority in 
terms of asymmetric necessitation: effects necessitate their causes, and not vice 
versa. More precisely, every part of an effect necessitates the cause, but the 
cause necessitates no part of the effect. 
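
Asymmetric necessitation can be pictured in a finite model. The sketch below makes assumptions not found in the text: there are four possible worlds, each token is represented by the set of worlds in which it is actual, strict necessitation is inclusion between those sets, and the parts of each token are listed by hand. The token names are hypothetical.

```python
# Hypothetical toy model: for each token, the set of worlds in which it is actual.
ACTUAL_IN = {
    "cause": {"w1", "w2", "w3"},
    "effect": {"w1"},
    "effect_part": {"w1", "w2"},  # a part is actual wherever its whole is
}

# Parts of each token (every token counts as a part of itself).
PARTS = {
    "cause": ["cause"],
    "effect": ["effect", "effect_part"],
}

def necessitates(a, b):
    """Strict necessitation: b is actual in every world in which a is actual."""
    return ACTUAL_IN[a] <= ACTUAL_IN[b]

def asymmetrically_necessitated(cause, effect):
    """Every part of effect necessitates cause; cause necessitates no part of effect."""
    return all(necessitates(x, cause) and not necessitates(cause, x)
               for x in PARTS[effect])
```

Here `asymmetrically_necessitated("cause", "effect")` holds, while the converse and the reflexive cases fail, which is the asymmetry and irreflexivity one wants from a priority relation.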

However, it remains necessary to distinguish the relation of causal priority 
from the relation of whole to part. I assume that the actuality of a token strictly 
necessitates the actuality of all of its parts. However, causes and effects must 
be separate existences, as Hume observed. Thus, we can define causal priority 
as follows: 


Definition 5.1 (Causal Priority)
\[ (s \prec s') =_{def} \forall x \sqsubseteq s'\,[\Box(Ax \rightarrow As) \,\&\, \Diamond(As \,\&\, \neg Ax)] \,\&\, \neg\exists x\,(x \sqsubseteq s \,\&\, x \sqsubseteq s') \]


5.4 Causation and Causal Explanation


5.4.1 Token-Level Causation, without Determinism 


I will insist only that a cause is quasi-sufficient for its effects. I will use the 
variably strict conditional, defined in the last section, to capture this condition. 

Token causation can be taken to imply either what must (with probability
infinitely close to 1) follow, or what might (with finite probability) follow. I will
define causation, $\rhd$, in terms of what follows immediately. Causation in the
ordinary sense will then be the transitive closure of immediate causation. An
essential virtue of my account is that the same sort of probabilistic relationship
holds in the cases of both mediate and immediate causation.

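The relation of immediate causal priority, $\prec_0$, was defined in the last chapter
as follows:


\[ (s \prec_0 s') =_{def} (s \prec s') \,\&\, \neg\exists x\,\exists y\,\exists z\,(x \sqsubseteq s \,\&\, y \sqsubseteq s' \,\&\, x \prec z \,\&\, z \prec y) \]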


We can now define token causation under conditions of indeterminism. I will 
give two definitions: one using strong probabilification, and the other weak. By 
strong probabilification, I mean that the probability of the effect conditional on 
the cause is robustly within an infinitesimal of 1. By weak probabilification, I 
mean that the probability of the effect conditional on the cause is robustly finite 
(not infinitely small). The first definition shall be the one that I make use of 
in the remainder of this project, but I wanted to note the existence of a weaker 
alternative that might well be appropriate in interpreting some of our talk of 
causation in natural language. 


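Definition 5.2 (Token Causation: Strong Probabilification)


\[ (s_1 \rhd s_2) =_{def} (s_1 \prec_0 s_2) \,\&\, \forall s_3\,(s_1 \sqsubseteq s_3 \,\&\, s_3 \prec_0 s_2 \rightarrow (s_1 \models (As_3 \Box\!\!\to As_2))) \]


Definition 5.3 (Token Causation: Weak Probabilification)


\[ (s_1 \rhd_w s_2) =_{def} (s_1 \prec_0 s_2) \,\&\, \forall s_3\,(s_1 \sqsubseteq s_3 \,\&\, s_3 \prec_0 s_2 \rightarrow (s_3 \models \neg(As_3 \Box\!\!\to \neg As_2))) \]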


5.4.2 Redefining Causal Explanation 


As in the case of token causation, causal explanation in the indeterministic
model involves the property of robustness-in-the-circumstances. Roughly, a fact
$(s : \phi)$ explains a fact $(s' : \psi)$ just in case $s$ is a token cause of $s'$, and there is a
causal constraint between $\phi$ and $\psi$ that is not overridden by any other feature
of $s$.

As in the deterministic case, I begin by defining a minimal token of a given
type.

Definition 5.4 (Minimal Tokens of a Type)


\[ (s \models_{min} \phi) =_{def} (s \models \phi) \,\&\, \forall x\,((x \sqsubseteq s \,\&\, (x \models \phi)) \rightarrow x = s) \]
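
The notion of a minimal token can be made concrete in a finite model: a token is minimal for a type when it supports the type and no proper part of it does. The sketch below rests on assumptions not in the text: tokens are sets of atomic facts, parthood is the subset relation, a type is given by the facts it requires, and the list of tokens is made up for illustration.

```python
# Hypothetical toy model: tokens are sets of atomic facts.
TOKENS = [frozenset(s) for s in (
    {"red"}, {"round"}, {"red", "round"}, {"red", "round", "heavy"},
)]

def supports(token, required):
    """token supports the type, where the type is given by its required facts."""
    return required <= token

def is_minimal_token(token, required):
    """token supports the type, and every part of token that supports it is token itself."""
    return supports(token, required) and all(
        x == token for x in TOKENS if x <= token and supports(x, required))

def minimal_tokens(required):
    """All tokens in TOKENS that are minimal for the given type."""
    return [t for t in TOKENS if is_minimal_token(t, required)]
```

For the type requiring both `red` and `round`, the only minimal token is the two-fact token; the three-fact token supports the type but has a proper part that already does.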


Causal constraint between types can then be defined in terms of the probabilistic
conditional $\Box\!\!\to$. A causal constraint holds between $\phi$ and $\psi$ just in case
the conditional probability that a given token is succeeded by a token of type
$\psi$ is infinitely close to 1, conditional on the given token’s being of type $\phi$.


Definition 5.5 (Causal Constraint between Types)


\[ (\phi \mathrel{|\!\sim} \psi) =_{def} \forall x\,((x \models_{min} \phi) \rightarrow (Ax \Box\!\!\to \exists y\,(x \prec_0 y \,\&\, Ay \,\&\, (y \models \psi)))) \]


Finally, the causal explanation relation holds between $(s : \phi)$ and $(s' : \psi)$
just in case $s$ is a cause of $s'$, and there is a causal constraint between $\phi$ and $\psi$
that is not canceled by adding any actual property of any token prior to $s'$.


Definition 5.6 (Causal Explanation) 


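\[ (s : \phi) \leadsto (s' : \psi) =_{def} (s \rhd s') \,\&\, (s \models \phi) \,\&\, (s' \models \psi) \,\&\, \forall s''\,\forall \chi\,((s \sqsubseteq s'') \,\&\, (s'' \prec_0 s') \,\&\, (s'' \models \chi) \rightarrow ((\phi \,\&\, \chi) \mathrel{|\!\sim} \psi)) \]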


5.5 Desirable Features of the Theory 


At this point, I would like to review the desiderata mentioned in the last chapter 
and check that the theory developed so far meets these goals. 


5.5.1 Transitivity and Irreflexivity 


There are several formal properties of causation that must be verified. First, I 
need to show that on the hypothesis of strong probabilification for immediate 
causal connection, strong probabilification also holds for mediate causal con- 
nection. Second, I must show that the same thing is true in the case of weak 
probabilification. Finally, I need to verify that mediate causal connection is 
irreflexive (and thus asymmetric). 

We can define mediate causation, $\rhd^{*}$, as the transitive closure of $\rhd$.


Theorem 5.1 (Strong Probabilification by Mediate Causation)
The Strong Probabilification Definition entails the following, on the condition
that $s$ is coherent and modally complete:


\[ M, s \models (s_1 \rhd^{*} s_2) \;\Rightarrow\; M, s \models (As_1 \Box\!\!\to As_2) \]


Proof: By induction. The base case is immediate. Assume $M, s \models (s_1 \rhd^{*} s_2) \,\&\, (s_2 \rhd s_3)$.
It suffices to show that $M, s \models (As_1 \Box\!\!\to As_3)$.

By inductive hypothesis, $M, s \models (As_1 \Box\!\!\to As_2)$. We also know that
$M, s \models (As_2 \Box\!\!\to As_3)$. Given the logic of the $\Box\!\!\to$-conditional as developed in appendix
A, it is sufficient to show that $M, s \models (As_2 \Box\!\!\to As_1)$, since that conditional
logic supports the rule (CSO):


CSO  $(\phi \Box\!\!\to \psi) \,\&\, (\psi \Box\!\!\to \phi) \Rightarrow [(\phi \Box\!\!\to \chi) \leftrightarrow (\psi \Box\!\!\to \chi)]$


That $M, s \models (As_2 \Box\!\!\to As_1)$ follows immediately from our assumptions
about the identity conditions of situations: since $s_1 \rhd^{*} s_2$, $s_1$ is causally prior to $s_2$,
and so $M, s \models \Box(As_2 \rightarrow As_1)$. $\Box(As_2 \rightarrow As_1)$ logically entails $(As_2 \Box\!\!\to As_1)$.
The coherency and completeness of $s$ are needed to ensure that we can employ
classical conditional logic in deriving $(As_1 \Box\!\!\to As_3)$ from $(As_1 \Box\!\!\to As_2)$ and
$\Box(As_2 \rightarrow As_1)$. QED
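
The inductive step can be laid out line by line. This is only a restatement of the argument just given, assuming that CSO is read in its standard form, as licensing the interchange of two antecedents that conditionally imply one another:

```latex
\begin{align*}
1.\quad & As_1 \Box\!\!\to As_2 && \text{inductive hypothesis} \\
2.\quad & As_2 \Box\!\!\to As_3 && \text{given, from } s_2 \rhd s_3 \\
3.\quad & \Box(As_2 \rightarrow As_1) && s_1 \text{ is causally prior to } s_2 \\
4.\quad & As_2 \Box\!\!\to As_1 && \text{from 3} \\
5.\quad & (As_1 \Box\!\!\to As_3) \leftrightarrow (As_2 \Box\!\!\to As_3) && \text{CSO, from 1 and 4} \\
6.\quad & As_1 \Box\!\!\to As_3 && \text{from 2 and 5}
\end{align*}
```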


Theorem 5.2 (Weak Probabilification by Mediate Causation)
The Weak Probabilification Definition entails the following, on the condition
that $s$ is coherent and modally complete:


\[ M, s \models (s_1 \rhd^{*} s_2) \;\Rightarrow\; M, s \models (As_1 \Box\!\!\to As_2) \]


Proof. The proof is similar to that of the last theorem. Once again, the
coherency and completeness of $s$ are needed to ensure that we can employ
classical conditional logic in deriving $(As_1 \Box\!\!\to As_3)$ from $(As_1 \Box\!\!\to As_2)$ and
$\Box(As_2 \rightarrow As_1)$.


Theorem 5.3 (Irreflexivity of Causation)
\[ M, s \not\models (s' \rhd^{*} s') \]


This theorem is an immediate consequence of the fact that the relation of 
causal priority is a strict partial ordering. 


5.5.2 Paradoxes of Causation and Statistics 


Theories of causation in probabilistic and other indeterministic settings often 
run aground on two test cases: causes with negative statistical relevance to their 
effects, and events that would have caused some effect, but were preempted by 
some other cause of the same kind of effect. 


Causes with negative statistical relevance 


A simple example of the first case would be an instance of surgery that killed 
the patient, even though the probability of short-term survival is increased by 
the occurrence of the surgery. Thus, the surgery event lowered the probability 
of death and yet caused the particular death in question. 

Let $s_1$ be the token-event of the surgery, $s_2$ the token-event of the patient’s
subsequent death, $\phi$ the event-type of the form of surgery performed, and $\psi$ the
event-type of death. By hypothesis, we have that $\Pr(\psi/\phi) < \Pr(\psi/\neg\phi)$. Put
qualitatively, we may assume that we have both $(\phi \Box\!\!\to \neg\psi)$ and $\neg(\neg\phi \Box\!\!\to \neg\psi)$.
Expressed in terms of tokens, we could say that $\Pr(As_2/As_1)$ is quite low, or,
in qualitative terms, that we have $(s_1 \Box\!\!\to s_4)$, where $s_4$ precludes $s_2$ ($s_4$ might
be a possible situation in which the patient survives).

In such cases, some unusual event occurred, such as an unusual condition of 
the patient, which made the surgery especially difficult or dangerous, or some 
unusual error or oversight on the part of the surgeon. Let us call this unusual 
event $s_3$, and its type, $\chi$. We have then:


\[ \Pr(\psi/\phi \,\&\, \chi) > \Pr(\psi/\chi) \]
\[ \Pr(As_2/A(s_1 \sqcup s_3)) > \Pr(As_2/As_3) \]


In other words, given the existence of $s_3$, the occurrence of $s_1$ did raise the
probability of death.
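
A numerical illustration may help to show that the two inequalities are jointly satisfiable. The counts below are entirely made up for the purpose (they are not data from the text or from any study): they describe 1000 hypothetical patients, classified by whether they had the surgery, whether the unusual condition was present, and whether they died.

```python
from fractions import Fraction

# Hypothetical counts per 1000 patients, keyed by
# (surgery, unusual_condition, death). Invented for illustration only.
COUNTS = {
    (True, True, True): 8,      (True, True, False): 2,
    (True, False, True): 2,     (True, False, False): 488,
    (False, True, True): 3,     (False, True, False): 7,
    (False, False, True): 147,  (False, False, False): 343,
}

def pr_death(surgery=None, unusual=None):
    """Pr(death | given values); None leaves that variable unconstrained."""
    def matches(s, u):
        return ((surgery is None or s == surgery)
                and (unusual is None or u == unusual))
    total = sum(n for (s, u, d), n in COUNTS.items() if matches(s, u))
    deaths = sum(n for (s, u, d), n in COUNTS.items() if matches(s, u) and d)
    return Fraction(deaths, total)
```

With these counts, death is less probable given surgery (1/50) than given its absence (3/10), yet given the unusual condition, surgery raises the probability of death from 11/20 to 4/5.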

In this case, we cannot say that $s_1$ was a total cause of $s_2$, that is, we cannot
assert $(s_1 \rhd s_2)$. Instead, we have that $s_1$ and $s_3$ are both essential parts of some
total cause of $s_2$. This means that we can assert that, under the circumstances,
s, was an INUS cause of so, i.e., ($1 ~* 52), despite the fact that s,; has, by 
itself, negative statistical relevance to s9. 


Preemption 


Suppose that a medication is given to a patient that significantly raises the prob- 
ability that the patient will recover within seven days, and raises that probability 
robustly. However, the patient’s own immune system overwhelms the infection 
before the medication begins to take effect. In this case, the medication would 
have become a cause of the recovery but was prevented from doing so by the 
preemptive action of the patient’s own system. 

Let $s_1$ be the event of the administration of the medication and $\phi$ be its
type, and let $s_2$ be the subsequent recovery, with its characteristic type $\psi$. We
are to assume that the probability of $\psi$ given $\phi$ is high, and robustly so (i.e.,
there is no actual situation $s'$ of type $\chi$ such that $\Pr(\psi/\phi \,\&\, \chi)$ is low). These
facts, however, are not enough to give us the conclusion that $s_1 \rhd s_2$.

The missing element here is a series of tokens linking $s_1$ with $s_2$. These
connections are, by hypothesis, lacking: $s_2$ is connected to a series of prior
situations involving the patient’s immune system. The occurrence of $s_1$ is simply,
in these circumstances, irrelevant to the occurrence of $s_2$. In cases in which
situations of type $\phi$ (medication) actually cause a situation of type $\psi$ (recovery),
there are various intermediate situations (for example, situations of the killing
of germs by the medication’s active ingredients) that are lacking in the case
in question.

In J. L. Mackie’s The Cement of the Universe (Mackie (1974)), Mackie dis- 
cusses a number of other examples of preemption. One of the most interesting 
concerns a man who sets off across the desert. The man has two mortal ene- 
mies, one of whom poisoned his reserve can of water, and the second of whom 
punctured that same can. The water in the can runs out before the man has a 
chance to drink it, and he dies of thirst. Which, if either, of the two enemies 
killed him?

This example illustrates the inadequacy of the counterfactual theory, or of
any theory of necessity-in-the-circumstances in which “necessity” is understood 
in modal rather than mereological terms. Clearly, the puncture of the can 
caused the man’s death, by causing his dehydration (the immediate cause of 
the death). Nonetheless, had the can not been punctured, the man would have 
died anyway, and perhaps even sooner. 

What is crucial about this example is that if we excise the event of the 
poisoning, we have a process that is robustly sufficient to ensure the man’s 
death, and the puncturing of the can is an indispensable part of this process. 
The poisoning turns out to be a preempted, merely counterfactual cause of the 
death. 

Consider the following variation. Suppose that the first enemy, instead of 
poisoning the can, emptied it. Or, suppose he emptied the man’s source of water 
before the can was filled, replacing the water with hydrogen peroxide. In these 
cases, I don’t think we can say that the puncture caused the death. Instead, 
it was the earlier elimination of water from the series of events that killed the 
man. In the second case, the puncture might obscure from the victim the fact 
that he had set out without sufficient water, but it did not cause this state of 
affairs. 


5.5.3 Genuine Overdetermination 


Suppose that a man dies at the hands of a firing squad, with bullets simulta- 
neously hitting several vital organs. In this case, the bullets are severally and 
jointly causes of the death. Each bullet wound is a sufficient condition, causally 
prior to the death. In each case, the firing of the bullet is an indispensable part 
of the sufficient condition. Hence each firing is a total cause of the death. 

This is an example that the counterfactual and necessity-in-the-circumstances 
accounts get wrong. None of the bullets is necessary in the circumstances, since 
the man would have died anyway. Only the entire volley counts as a cause, 
according to the counterfactual theory. This erroneous conclusion arises from 
confusing mereological indispensability with modal necessity. 


5.5.4 Preemption by Trumping 


In recent work on the counterfactual theory of causation, Jonathan Schaffer 
(2000) has created an interesting variant on the idea of preemption: preemption 
by trumping. Schaffer gives an example of two wizards who simultaneously cast
a spell, turning, say, a prince into a frog. Wizard A uses a more powerful
form of magic than Wizard B: whenever the two cast spells with conflicting 
implications, Wizard A’s spell always wins out. Schaffer argues, convincingly 
to me, that in the case in which both cast the same spell, it is Wizard A’s 
spell, and not Wizard B’s, that causes the transformation. Wizard B’s spell is 
preempted, not by an earlier cause, but by a trumping cause. 

We can imagine non-magical examples as well. A major and a sergeant 
simultaneously shout “Charge!” to a private under their command. The private
is disposed to obey orders from both officers, but whenever there is a conflict,
he obeys the major rather than the sergeant. When both officers give the same 
order simultaneously, the major’s order trumps the sergeant’s. 

This sort of example is easily handled by the indeterministic model of causation
developed in this chapter. We can suppose that there are defeasible conditionals
linking both orders to the state in which they are carried out. The
presence of the major giving orders is a defeater to the conditional linking the 
sergeant’s order with its fulfillment. Hence, the sergeant’s order by itself is not a 
robustly sufficient condition under the circumstances. To restore the defeasible 
conclusion, we would have to add the content of the major’s orders. However, 
the situation containing both orders is not a minimal cause of the private’s re- 
sponse: the major’s order by itself is a robustly sufficient condition. Hence, the 
major’s order is an INUS cause of the private’s response, while the sergeant’s 
order is not. 


5.5.5 Negative Causation Revisited 


As we saw in the last chapter, the deterministic model leads to an inflation of 
causes, in particular, to an inflation of negative facts as causes. The absence of 
every potential preventer of an event gets counted as an actual cause of that 
event. Moving to an indeterministic model enables us to avoid this inflation. 

However, there will still remain negative causes, even under the indetermin- 
istic model. The difference is this: not every absence of a potential preventer 
gets counted. Rather, there must exist some actual condition that would, were 
it not for some unusual absence, prevent the effect. 

Consider again the example of the terrorists who cause a midair collision 
by causing the absence of the air traffic controllers from their posts. On the 
deterministic model, we would have to count the absence from the control tower 
of any person who could have prevented the collision. On the indeterministic 
model, we must first find an actual defeater to the connection between the 
flight paths of the airliners and their subsequent collision. There is in fact such 
a defeater: the existence of the air traffic control system at the airport, a kind 
of institutional fact about that airport. Thus, the flight paths alone are not a 
robustly sufficient condition: taking into account the existence of the air traffic 
control system would lead to the expectation that a collision will not occur. To 
obtain a true total cause of the collision, we must find a defeater of this defeater. 
The actions of the terrorists, interfering with the normal operation of the air 
traffic control system by locking the controllers in a closet, would qualify. Thus, 
the absence of the controllers from their posts is part of a minimal total cause 
of the collision. The absence from the tower of some other person who might 
have prevented the collision does not so qualify, and so is not an INUS cause of 
the collision. 

The definition of causation that I gave in section 5.4 requires some mod- 
ification if it is to apply correctly to preventions. Consider, for example, the 
following sort of case. A baseball is hit toward the right field fence. The “fence” 
is actually a high, thick wall of concrete. Just beyond the fence, in a position
that would cross the flight of the ball in the absence of the fence, there is a
fragile window. The right fielder catches the ball before it hits the fence. Did 
the fielder prevent the ball from breaking the window? If we apply the defi- 
nition in 5.4, we get the result that the fielder’s catch is an essential part of 
a quasi-sufficient condition of the ball’s not breaking the window. Hence, we 
seem compelled to say, contrary to our clear intuitions, that the fielder’s catch 
caused the window not to be broken. 

In order to handle this sort of case, we must modify the definition of causa- 
tion in the case of preventions, that is, when the potential effect is an absence 
(like the non-breaking of a window). In such cases, we must add a new con- 
dition: in the absence of the “cause,” there must exist a condition that would 
have been robustly sufficient for the prevented situation. Because of the pres- 
ence of the fence, there was no such robustly sufficient condition of the breaking 
of the window. Hence, the fielder’s catch should not be counted as a cause of 
the absence of the breaking. 

More precisely, we can say that s is a total cause of s’, where s’ is a negative 
situation (either a pure absence or the absence of a change), if and only if: 


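1. $s \prec_0 s'$


2. $\exists s''\,\bigl(s'' \prec_0 s' \;\&\; \forall s_1\,\bigl(s'' \sqsubseteq s_1 \;\&\; \neg(s \sqsubseteq s_1) \;\&\; s_1 \prec_0 s' \rightarrow (s'' \models (As_1 \mathrel{\Box\!\!\rightarrow} \neg As'))\bigr)\bigr)$


3. $\forall s_1\,\bigl(s \sqsubseteq s_1 \;\&\; s_1 \prec_0 s' \rightarrow (s \models (As_1 \mathrel{\Box\!\!\rightarrow} As'))\bigr)$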


It is the second condition that is new. It is not enough for s to be a robustly 
sufficient condition of s’: it must be the case that there is another situation, s’’, 
which would, but for s, be a robustly sufficient condition for the non-actuality 
of s’, that is, of the positive condition or change of which s’ is the absence. 


5.6 Example Applications 


I started the chapter arguing that the deterministic conception of causation 
over-generated causal connections and causal explanations, and I used several 
examples from the previous chapter to make this point, in particular, the cases 
of the Turing machine and of one-dimensional billiards. At this point, I would 
like to re-analyze these examples within the indeterministic model. 


5.6.1 Turing Machines 


In the immediately preceding chapter, we saw that the deterministic model of 
explanation resulted in explanation inflation: in order to explain the persistence 
of a value from one stage to a later stage, we had to introduce into the explanans 
enough information about the position of the head and the values of other 
squares to guarantee that the head not return to the square in question during 
the intervening period. Using an indeterministic conception of causation, we 
can avoid this result. 

For example, let us impose a hyperreal-valued probability function on the 
total runs of the machine in such a way that: 


• If the head never returns to the same square twice, the run is given a finite 
probability. 


• If the head returns to some square once, but to no square more than once, 
the run is given a probability of ε, where ε is an infinitesimal. 


• If the head returns to some square n times, but to no square more than n 
times, then the run is given a probability of the order of εⁿ. 


Suppose square n has been visited by the head i times by stage a, and that 
its value at stage a is j. The probability that the head will return for an (i+1)st 
visit, conditional on the past history of square n, is infinitesimal. Thus, the 
probability, for any m, that square n will have the value j at stage a+m, given 
that it has that value at stage a, is infinitely close to 1. Hence, in cases in which 
in fact the head does not return to square n during the period between a and 
a+m, we can explain the value of the square at stage a+m by means of its 
value at stage a alone. 
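
The stipulated ordering of runs can be made concrete with a small sketch. The Python function below is an illustration only: the name `epsilon_order` and the encoding of a run as a list of head positions are assumptions introduced here, not part of the formal model. It further assumes that the head moves at every step, so that each repeated appearance of a square in the list is a genuine return. The function reports the exponent n such that the run's probability is of the order of εⁿ.

```python
from collections import Counter


def epsilon_order(head_positions):
    """Return n such that the run's probability is of order epsilon**n.

    head_positions lists the square under the head at each stage
    (a hypothetical encoding). A square visited k times has been
    returned to k - 1 times, so n is the largest such k - 1.
    """
    visits = Counter(head_positions)
    if not visits:
        return 0
    return max(visits.values()) - 1


# The head sweeps right and never returns: finite probability (order 0).
print(epsilon_order([0, 1, 2, 3]))
# The head returns to square 0 once: probability of order epsilon.
print(epsilon_order([0, 1, 0, -1]))
# The head returns to square 0 twice: probability of order epsilon**2.
print(epsilon_order([0, 1, 0, 1, 0]))
```

On this encoding, a run in which square n is never revisited between two stages has a strictly lower order than any run in which it is revisited, which is what licenses explaining the later value by the earlier value alone.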


5.6.2 One-Dimensional Billiards 


In the example of one-dimensional billiards, the deterministic conception of 
causation forced us to include the initial state of one disk as a part of the 
cause of any of the pre-collision states of the other disk. This involved an over- 
generation of causal connections, since, intuitively, the initial state of the first 
disk is causally connected to the states of the second disk only after a collision 
event. 

In order to apply the indeterministic conception of causation to this example, 
we must stipulatively define an appropriate probability assignment to possible 
total histories. Let us assign a finite probability to a history just in case no 
collisions occur in that history. If we are dealing with a setup in which there 
are more than two disks, and thus the potential for more than one collision, we 
can stipulate that the probability of n + 1 collisions is infinitely smaller than 
the probability of n collisions, for every n. 

Given these stipulations, it is possible to describe the setup in such a way 
that the only cause of the pre-collision states of a disk is the initial state of 
the disk, since the probability of a collision is infinitesimal. More generally, the 
causes of the state of a disk will include the initial state of that disk and of any 
disk that has collided with that disk, and the initial state of any disk that has 
collided with any of those disks, and so on. Initial states of disks that have not yet 
interacted with the disk in question, either directly or indirectly, need not be 
counted as part of the cause of that disk’s present state. 
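
The recursive description of the causes of a disk's state can be sketched as a simple computation. The Python below is only an illustration under stated assumptions: the function name `causal_ancestors`, the disk labels, and the representation of the actual history as a chronological list of collision pairs are all hypothetical. Processing the collisions in temporal order matters, since a disk inherits only what had already influenced its partner at the time of the collision.

```python
def causal_ancestors(disks, collisions):
    """Map each disk to the set of disks whose initial states are
    among the causes of its present state.

    collisions is a chronological list of (disk, disk) pairs
    (a hypothetical encoding of the actual history).
    """
    influence = {d: {d} for d in disks}
    for a, b in collisions:
        # After a collision, each disk inherits everything that had
        # already influenced the other.
        merged = influence[a] | influence[b]
        influence[a] = merged
        influence[b] = set(merged)
    return influence


# A collides with B, and afterwards B collides with C; D is untouched.
history = [("A", "B"), ("B", "C")]
result = causal_ancestors(["A", "B", "C", "D"], history)
for disk in sorted(result):
    print(disk, sorted(result[disk]))
```

In this toy history the initial state of A counts among the causes of C's present state (by way of B), while C's initial state is not among the causes of A's state, because A's only collision preceded B's encounter with C.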


5.6.3 Mackie’s Slot Machines 


I would like to conclude this chapter by examining the indeterministic slot ma- 
chines L and M, introduced by J. L. Mackie in his book The Cement of the 
Universe (Mackie (1974)). In slot machine L, the insertion of a shilling coin is 
necessary, but not sufficient, for the production of a chocolate bar. In machine 
M, the insertion is sufficient, but not necessary. 

In the case of machine L, it is clear that every coin-insertion token is causally 
prior to the subsequent chocolate-bar-output token, if there is one. If we assume 
that the insertion of the coin raises the probability of the chocolate bar’s being 
produced to a finite level, then this example conforms to the weak probabili- 
fication definition of token causation. Since machine L seems a clear case of 
causation, as Mackie asserts, this example provides some reason for thinking 
that only weak probabilification is necessary for causation. 

In the case of machine M, the insertion of the coin is sufficient for the pro- 
duction of a chocolate bar, but sometimes a bar is produced spontaneously. 
Suppose s is a particular token of coin insertion, and suppose that it is immediately 
followed by an event s’ of chocolate-bar production. If s is causally prior 
to s’, then clearly s is a cause of s’, since the existence of s’ is necessitated by 
the existence of s in this case. However, it is not clear that this relation of causal 
priority actually holds. Suppose there was a probability of p that a chocolate 
bar would be produced spontaneously at the very time of the occurrence of s’. 
It would appear that the probability that s’ is actually posterior to s is only 
1 − p, with a preemption of the coin-bar connection occurring with a probability 
p. Depending on how the details of the example are filled in, it may or may not 
be possible to decide empirically whether the causal connection to s is actual 
or preempted. 


6 


A Probabilistic Model 
of Causation 


In this chapter, I will develop a quantitative, probabilistic model of causation, 
building on the work done in the preceding two chapters. As before, the desider- 
ata for the model include securing the transitivity of causation and the statistical 
independence properties associated with the Markov properties. At the same 
time, I want to faithfully represent the full complexity of the relationship be- 
tween causation and probability, including the possibility of causes with negative 
statistical correlation to their effects, the possibility of independent overdeter- 
mination, and the possibility of cases of the preemption of causality, in which 
positive correlation is not sufficient for causal connection. 

To keep things as simple as possible, I will assume that all situation-tokens 
in all models of probabilistic causation are probabilistically complete. I will 
include a set W of coherent and complete situation-tokens that will constitute 
the possible worlds of the model, and I will assume that to each token is as- 
signed a standard probability function over the set of worlds. I will not try to 
accommodate modal or stochastic partiality in this chapter. 


6.1 Models 


A probabilistic model M shall consist of a tuple ⟨Sit, W, Typ, ⊨, ⊑, μ⟩, where: 


• Sit is a nonempty set, the set of situation-tokens. 

• W is a nonempty subset of Sit, representing the possible worlds. 

• Typ is a nonempty set of situation-types, closed under the Boolean operators 
∨ and ¬. 

• ⊨ is a binary relation on Sit × Typ. 

• ⊑ is a partial, antisymmetric ordering of Sit. 

• μ is a function from ℱ into the interval [0,1], where ℱ is a set of subsets 
of W. The set ℱ, the set of measurable events, must be closed under finite 
union, intersection, and complement. 


For simplicity’s sake, I will assume that W is finite, and that ℱ = ℘(W). The 
measure function μ can then be extended to any set of worlds A by stipulating 
that: 

\[ \mu(A) = \sum_{w \in A} \mu(\{w\}) \]

The measure function μ can be used to define a conditional probability 
function Pr defined on all pairs of non-empty subsets of Sit, where: 

\[ \Pr(A/B) =_{def} \frac{\mu(\{w : \forall s \in A\,(s \sqsubseteq w) \;\&\; \forall s \in B\,(s \sqsubseteq w)\})}{\mu(\{w : \forall s \in B\,(s \sqsubseteq w)\})} \]

I will use Pr(s’/s) to abbreviate Pr({s’}/{s}), and Pr(s’’/s, s’) to abbreviate 
Pr({s’’}/{s, s’}). 
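
A finite model of this kind is easy to compute with. The following Python sketch is an illustration under simplifying assumptions that go beyond the text: situation-tokens are represented as frozensets of atomic facts, the part-of ordering is the subset relation, and the four worlds and their measures are invented for the example. The function `pr` (a hypothetical name) computes the ratio of measures that defines Pr(A/B).

```python
from fractions import Fraction


def pr(mu, A, B):
    """Pr(A/B): the measure of the worlds containing every token in A
    and in B, divided by the measure of the worlds containing every
    token in B. Tokens and worlds are frozensets; part-of is subset."""
    b_worlds = [w for w in mu if all(s <= w for s in B)]
    ab_worlds = [w for w in b_worlds if all(s <= w for s in A)]
    denominator = sum((mu[w] for w in b_worlds), Fraction(0))
    if denominator == 0:
        raise ValueError("Pr(A/B) is undefined when B has measure zero")
    return sum((mu[w] for w in ab_worlds), Fraction(0)) / denominator


coin = frozenset({"coin inserted"})
bar = frozenset({"bar produced"})
empty = frozenset()  # the empty token, part of every world

# Four hypothetical worlds with an assumed measure on singletons.
mu = {
    coin | bar: Fraction(2, 5),
    coin: Fraction(1, 10),
    bar: Fraction(1, 10),
    empty: Fraction(2, 5),
}

# Chance of a bar given a coin insertion.
print(pr(mu, [bar], [coin]))
# Chance of a bar conditional only on the empty token.
print(pr(mu, [bar], [empty]))
```

Exact rational arithmetic is used so that comparisons of a probability against a threshold are not disturbed by floating-point rounding.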


6.2 Token Causation 


I will take the definitions of token constraint and token causation from the last 
chapter and simply introduce a new parameter r, representing the probabilistic 
strength of the connection. 


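Definition 6.1 (Weighted Causal Constraint) 
\[ (s \mathrel{|\!\sim_r} s') =_{def} (s \prec_0 s') \;\&\; \Pr(s'/s) \geq r \]

Definition 6.2 (Weighted Token Causation) 
\[ (s \triangleright_r s') =_{def} As \;\&\; As' \;\&\; (s \prec_0 s') \;\&\; \forall s''\,\bigl(s \sqsubseteq s'' \;\&\; (s'' \prec_0 s') \rightarrow (s'' \mathrel{|\!\sim_r} s')\bigr) \]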


6.3 Weighted Causal Constraints on Types 


I will now extend the definitions of causal constraints on types to the probabilistic 
setting. 

Generalized weighted constraints between types can now be defined in terms 
of objective probability. 


Definition 6.3 (Weighted Causal Constraints) 
\[ (\phi \mathrel{|\!\sim_r} \psi) \Leftrightarrow_{def} \forall x\,\bigl((x \models_{min} \phi) \rightarrow \Pr(\exists y\,(x \prec_0 y \;\&\; Ay \;\&\; (y \models \psi))/Ax) \geq r\bigr) \]


6.4 Probabilistic Explanation 


I will define explanation as a form of robust objective chance. 


Definition 6.4 (Immediate Probabilistic Explanation) (s₁ : φ) immediately 
explains (s₂ : ψ) to degree r if and only if: 

\[ (s_1 \triangleright s_2) \;\&\; (s_1 \models \phi) \;\&\; (s_2 \models \psi) \;\&\; \forall s\,\forall\chi\,\bigl((s_1 \sqsubseteq s) \;\&\; (s \prec_0 s_2) \;\&\; (s \models \chi) \rightarrow ((\phi \;\&\; \chi) \mathrel{|\!\sim_r} \psi)\bigr) \]


6.5 Examples 


In this section, I will illustrate how this model can be applied to several well- 
known puzzles. 


6.5.1 Risky Surgery 


A risky form of heart surgery, S, is known to be fatal in some circumstances and 
is performed only when the chances of imminent death are very high. Overall, 
the surgery increases the chances of the patient’s imminent survival. The diffi- 
culty here is to give a probabilistic explanation of a case in which the surgery 
was fatal, despite the fact that the surgery decreases the probability of imminent 
death. Let us suppose that the surgery is uniformly performed in circumstances 
in which the probability of death would otherwise be 75%, and that the proba- 
bility of imminent death, given the performance of the surgery, is 50%. 

There are a number of ways in which this could be done. First, it could be 
that there are certain conditions of the patient that cannot be determined until 
the surgery is begun, but that, in combination with the surgery, greatly increase 
the probability of death. Call such a condition C. Suppose that the probability 
of death given C and no surgery is 80%, and that the probability of death given 
the conjunction of C and surgery is 90%. Let us suppose that this connection is 
robust in this case: there are no other actual factors that would, in conjunction 
with C and S, lower the probability of imminent death below 90%. In such a 
case, the conjunction C & S explains, to degree 0.9, the subsequent death of the 
patient. When this S is an indispensable component of this explanation, it does 
cause the death, despite the fact that S alone would support a probability of 
death of only 50%. 
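
The robustness requirement in this example can be illustrated with a toy calculation. The sketch below is not the formal definition: it assumes that situations can be represented as sets of named factors and that a table gives the chance of death for each combination. The figures 0.5 and 0.9 come from the example; the extra factor "age" and the two figures attached to it are invented purely for illustration, as is the function name `robust_degree`. The degree to which a set of factors robustly supports the effect is taken to be the lowest chance over every set of actual factors extending it.

```python
def robust_degree(chance, base, actual_factors):
    """Lowest chance of the effect over every set of actual factors
    that extends `base`; in this toy setting, the degree to which
    `base` robustly supports the effect."""
    base = frozenset(base)
    actual = frozenset(actual_factors)
    extensions = [f for f in chance if base <= f <= actual]
    return min(chance[f] for f in extensions)


# Chance of imminent death given each combination of factors.
# 0.5 and 0.9 come from the example; the entries involving the
# hypothetical extra factor "age" are invented for illustration.
chance = {
    frozenset({"S"}): 0.5,
    frozenset({"S", "C"}): 0.9,
    frozenset({"S", "age"}): 0.55,
    frozenset({"S", "C", "age"}): 0.93,
}
actual = {"S", "C", "age"}

print(robust_degree(chance, {"S", "C"}, actual))
print(robust_degree(chance, {"S"}, actual))
```

With these assumed figures, no actual factor pulls the chance supported by C & S below 0.9, whereas S on its own robustly supports only 0.5.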


6.5.2 The Pill and Thrombosis 


Another example of a paradoxical situation is provided by the relationship be- 
tween the use of the birth control pill and the occurrence of thrombosis. There 
is some direct effect of the pill on the clotting mechanism, increasing somewhat 
the chances of thrombosis. However, the pill prevents pregnancy, and pregnancy 
has an even greater effect on clotting, leading to an even greater chance 
of thrombosis. The difficulty lies in providing a probabilistic explanation of the 
onset of thrombosis in terms of the use of the pill in a particular case, despite 
the fact that the use of the pill lowers the probability of thrombosis by lowering 
the probability of pregnancy. 

In this case, there is some intermediate factor between the use of the pill and 
the onset of thrombosis; however, this is not essential to the example. So, let’s 
suppose that the pill’s causation of thrombosis is direct, and its suppression of 
thrombosis is indirect (via the prevention of pregnancy). Let us say that the 
probability of thrombosis, in the absence of some causal factor, is negligible. 
Let’s suppose that the probability of thrombosis given the pill is 1%, that the 
probability of thrombosis given pregnancy is 2%, and that the pill is 100% 
effective in preventing pregnancy, which otherwise, in the circumstances, has a 
10% probability of occurring. 

In a case in which the pill causes thrombosis, we can give a probabilistic 
explanation of this fact in terms of the use of the pill: the use of the pill explains 
the onset of thrombosis to degree 0.01. This connection is robust: there are no 
other factors we can add that would bring the probability below 1%. The use 
of the pill is an indispensable part of this explanation, since the probability of 
the occurrence of thrombosis, in the absence of any causal factor, is negligible. 

It is also true that, in cases in which the pill prevents thrombosis through 
preventing pregnancy, we can give a robust probabilistic explanation of 
this fact. The use of the pill explains the non-occurrence of pregnancy to degree 1 
(since we have assumed, unrealistically, that the pill never fails), and the 
non-occurrence of pregnancy, together with relevant facts about the patient’s 
sexual activity, explains the non-occurrence of thrombosis to degree 0.99. What 
makes the use of the pill indispensable in this case is the requirement of robustness. 
We could, without reference to the pill, find actual conditions that 
made the non-occurrence of thrombosis much more likely than 99%. These 
conditions would not constitute an explanation of this particular case of the 
non-occurrence of thrombosis, because of a failure of robustness. The addition 
of actual tokens, supporting the occurrence of sexual activity, would bring the 
probability of non-occurrence of thrombosis below 99%. Thus, the use of the 
pill is an indispensable part of a robust explanation of this particular case of 
the prevention of thrombosis. 


6.5.3 A Hole in One the Hard Way 


Deborah Rosen (as cited by Patrick Suppes (Suppes, 1984, p. 41)) created 
another example of causation with negative statistical relevance: making a hole 
in one “the hard way.” A golfer makes a shot that slices badly, striking a tree 
limb, by which it is deflected directly into the hole. Hitting the ball as badly as 
the golfer did significantly lowers the probability of a hole in one, as compared 
to making a competent stroke in the first place. Hitting the tree limb, let’s say, 
also lowers the probability of the hole in one. Nonetheless, in this particular 
case, the golfer’s swing and the ball’s striking the limb are clearly both causes 
of the hole in one. 

We have three situations: s₁, the golfer’s bad stroke (supporting type φ), s₂, 
the ball’s hitting the limb, and s₃, the ball’s falling into the hole (type ψ). The 
intermediate situation-token supports a number of types, of varying degrees of 
specificity. There is the type χ, which gives us the bare information that the 
ball struck the limb, and, at the opposite extreme, there is the type ρ, which 
includes complete information about the velocity and mass of the ball and the 
shape and elasticity of the tree limb. 

At the level of tokens, we can be confident that there is some situation s₄, 
containing s₁ as a part, together with information about the wind, the ball, 
and so on, such that we have, for some finite r, the weighted causal constraint 
s₄ |~_r s₂. In turn, there is some larger situation s₅, of which s₂ is an essential 
part, and some finite q such that we have s₅ |~_q s₃. The information in s₁ 
(the swing) and in s₂ (the collision with the tree branch) is surely essential to 
robustly supporting the objective chances of r and q respectively. Hence, both 
s₁ and s₂ are INUS causes of the hole in one. Even if r and q are quite small, it 
seems clear that, in the absence of information about the swing, the objective 
chance of hitting the limb would be much lower (perhaps zero, since golf balls 
don’t spontaneously leap across the course), and similarly, in the absence of 
information about the collision with the limb, the objective chance of the hole 
in one (given the slice) would be much smaller than q. 

At the level of types, we know that the type φ (the bad slice) is negatively 
correlated with the type of a hole in one (ψ). We are also supposing that 
the relatively underspecified type of a collision with the tree limb (χ) is 
negatively correlated with the hole in one. Nonetheless, there is a weighted 
causal constraint linking the stroke-type φ with the very specific collision-type 
ρ, and another weighted constraint linking ρ with the hole-in-one type ψ. Thus, 
we do have immediate causal explanations linking φ to ρ and ρ to ψ. The bad 
swing-type, φ, can therefore figure in a mediate causal explanation of the hole 
in one, despite the lack of a positive statistical relationship between φ and ψ. 

It is important to bear in mind that causation at the token level is a precondition 
of fact/fact causation, or causal explanation at the type level. If there were 
no intermediate situation s₂ actually instantiating type ρ, the merely generic 
links between φ and ρ and between ρ and ψ would have nothing to do with 
providing an explanation of the fact that s₃ instantiates ψ. 


6.5.4 Mishap at Reichenbach Falls 


I. J. Good’s example (as cited by Hitchcock (1995)) of a mishap at Reichenbach 
Falls involving Sherlock Holmes, Dr. Watson, and Professor Moriarty is of a very 
similar structure to that of the Rosen hole in one. In this case, Watson sees 
Moriarty about to push a boulder in a direction that will almost certainly result 
in its crushing Holmes. Watson preempts Moriarty by pushing the boulder in 
another direction, thereby lowering the probability of Holmes’s death. Unfor- 
tunately, the boulder takes an unlikely path down the falls, crushing Holmes 
despite Watson’s good intentions. 

Once again, we have a cause (Watson’s pushing of the boulder) which actually 
lowered the probability of its effect (Holmes’s death). For simplicity’s sake, 
I will ignore in this case the intermediate events and look simply at s₁, Watson’s 
pushing of the boulder, and s₂, Holmes’s unfortunate death. The situation s₁ is 
an essential part of some larger situation s₃, supporting information about the 
shape and mass of the boulder and the contour of the cliff side. There is some 
objective chance q linking s₃ to s₂. We are assuming that q is quite small, certainly 
smaller than the probability of Holmes’s death had Moriarty pushed the 
boulder. Nonetheless, q is finite, and the inclusion of Watson’s push (s₁) in the 
total cause s₃ is surely essential. Given only the character of the boulder and 
the contour of the mountain, the objective chance of Holmes’s death is much 
lower even than q. 

On my account, we do not look at what would have happened in the absence 
of the cause. The presence and intentions of Moriarty are irrelevant to the 
question of whether Watson’s push was an INUS cause of Holmes’s death. We 
evaluate the contribution of Watson’s push mereologically, not counterfactually. 
We see if we can find a situation that does not include Watson’s push that 
supports an objective chance as high as that supported by situations that do 
include Watson’s push. 


6.5.5 Cartwright’s Poison Oak Defoliant 


Nancy Cartwright constructed the following example. A gardener had to choose 
whether to buy a defoliant that is 99% effective or a cheaper one that is only 
90% effective. She chose the cheaper defoliant and sprayed a patch of poison 
oak with it. The poison oak survived. Was the spraying of the poison oak with 
the cheaper defoliant a cause of its survival? In my view, the answer is clearly 
“No,” and this is the result we obtain by applying my model. There is some 
situation involving the hardiness of the poison oak that supported an objective 
chance of the plant’s survival of at least 10%. This situation need not include 
any information about the defoliant. Conversely, there is no situation robustly 
supporting a higher chance of survival that contains the spraying event. 

It is true that the gardener’s buying the cheaper defoliant was a cause of 
the plant’s survival, since it was presumably her buying of the cheaper defoliant 
that caused her not to buy and use the more expensive one. If we assume that 
there was at some point a chance of the gardener’s using the more effective 
spray, and that the choice to buy the cheaper spray precluded that use, then 
the decision to buy the cheaper one was clearly an INUS cause of the plant’s 
survival. However, the gardener’s use of the cheaper spray was not a cause. The 
plant survived despite, and not because of, that spraying. 
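
The mereological test used in the last two examples can be sketched in a few lines. The Python below is a rough illustration, not the formal definition: situations are represented as sets of named factors, the function name `is_indispensable` is hypothetical, and the two chances in the Reichenbach Falls table are invented, since the text says only that q is small and that the chance without the push is much lower. The 10% survival chance for the poison oak is taken from the 90% effectiveness of the cheaper defoliant.

```python
def is_indispensable(factor, chance):
    """True if no situation lacking `factor` supports a chance of the
    effect as high as the best chance supported by a situation that
    includes it."""
    with_factor = [c for s, c in chance.items() if factor in s]
    without_factor = [c for s, c in chance.items() if factor not in s]
    if not with_factor:
        return False
    best_with = max(with_factor)
    return all(c < best_with for c in without_factor)


# Chance of Holmes's death; both figures are invented for illustration.
holmes = {
    frozenset({"boulder", "cliff"}): 0.001,
    frozenset({"boulder", "cliff", "push"}): 0.05,
}

# Chance of the plant's survival; adding the spraying event does not
# yield a situation supporting a higher chance.
poison_oak = {
    frozenset({"hardiness"}): 0.10,
    frozenset({"hardiness", "spray"}): 0.10,
}

print(is_indispensable("push", holmes))
print(is_indispensable("spray", poison_oak))
```

Nothing in the computation asks what would have happened had Moriarty pushed the boulder or had the gardener bought the better spray; only the chances supported by actual situations and their parts are compared.
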


6.6 Humphreys’s Explanation 


In The Chances of Explanation (Humphreys (1989)), Paul Humphreys offers the 
following definition of a direct contributing cause: 


B is a direct contributing cause of A just in case: 
1. A occurs; 
2. B occurs; 


3. B increases the chance of A in all circumstances Z that are 
physically compatible with A and B, and with A and B₀ (where 
B₀ is the neutral state of system B), i.e., Pr(A/BZ) > Pr(A/B₀Z) 
for all such Z; and 


4. BZ and A are logically independent. 


The first thing that jumps out from the definition is the fact that the capital 
letters A and B are being used ambiguously, sometimes to refer to situation- 
tokens (as in conditions (1) and (2)) and sometimes to refer to types, as in 
conditions (3) and (4). Fortunately, it is relatively easy to disentangle the 
ambiguity in a way that clearly conforms to Humphreys’s intentions. 

Second, Humphreys’s definition depends on the problematic notion of the 
neutral state of a system. In some cases, this is relatively easy to determine: 
absolute zero for temperature, rest for relative velocity, etc. In other cases, it 
is apparently impossible. What is the neutral state of human intelligence or 
personality type? What is the neutral state of sex? The application of situation 
theory, with its partial, three-valued interpretations, offers an attractive alter- 
native. We can insist that type A contribute positively to the chances of type B 
both alone and in combination with any actually realized type in a given token. 


Definition 6.5 (Humphreys’s Explanation) (s : φ) ~_H (s’ : ψ) iff s ▷ s’, 
and for all types χ such that s ⊨ χ and φ is not a subtype of χ, the objective 
probabilities are such that Pr(ψ/χ) < Pr(ψ/(χ & φ)). 


This definition preserves the desirable features of Humphreys’s explanation. 
For instance, if (s : φ) and (s : χ) are both Humphreys’s explanations of (s’ : ψ), 
then so is (s : (φ & χ)). 


7 


Higher-Order Causation: 
Modal Facts as Causes 


I have two reasons for being interested in a theory of higher-order causation, 
that is, a theory of how modal and causal facts can themselves cause concrete 
situations. First, my account of teleological or functional causation and expla- 
nation will be explicitly higher order. Second, I want to build a causal theory of 
modal and mathematical knowledge, which obviously depends on the possibility 
that modal facts can be part of the cause of mental states and processes. 

I have been careful in the preceding chapters to allow for the construction 
of three-valued and four-valued semantics for all of the modal elements used 
in defining causal and explanatory relations. Consequently, we can look for 
varying degrees of modal partiality in order to determine just how many, and 
which particular, modal facts are needed in deriving a given causal consequence. 


7.1 A Problem with Higher-Order Causation 


In a recent paper, Hitchcock (1996) uses Ellery Eells’s (Eells (1991)) definition of 
causal relevance to defend the intelligibility of higher-order causation. According 
to Eells, a property φ is (positively) causally relevant to property ψ in population 
p just in case the objective probability Pr(ψ/φ & η) is strictly greater than 
Pr(ψ/¬φ & η), for all homogeneous background contexts η. In applying Eells’s 
definition to higher-order causation of the kind employed in the definition of 
teleology, we must suppose that φ is itself a property involving causal relevance. 
For example, Wright’s definition of φ’s having ψ as a function would, when 
translated into Eells’s definition of causation, come out as something like this 
(ignoring the background contexts for the sake of simplicity): 


\[ \Pr(\phi/[\Pr(\psi/\phi) > \Pr(\psi/\neg\phi)]) > \Pr(\phi/[\Pr(\psi/\phi) < \Pr(\psi/\neg\phi)]) \]

This account depends on making sense of higher-order objective chance, in 
particular, of making sense of the present objective chance of ψ given φ, and of 
ψ given ¬φ, being other than they actually are. As Hitchcock notes, it is very 
hard to see how to make sense of the present objective chance of any present 
objective chance being anything other than 1 or 0. In the present state of the world, 
whatever factors determine objective chance are either definitely present or definitely 
absent, so the actual objective chance of any proposition is fully determined. 

Hitchcock attempts to circumvent these problems without resorting to sit- 
uation theory by introducing the parameter of populations. He suggests that 
we treat the objective chance of ψ given φ, and of ψ given ¬φ, as properties 
of various actual and hypothetical populations. The claim about higher-order 
causation is then taken to be a claim about a super-population, whose indi- 
vidual members are actual or hypothetical populations. However, Hitchcock 
has merely sidestepped the problem. To make sense of this solution, we must 
know two things: (i) which hypothetical populations to include as members of 
the superpopulation, and (ii) what probability measure over these hypotheti- 
cal populations to use in computing the higher-order probability. To have a 
principled solution to these two problems, we would have to know the objective 
chance of the various objective chances represented in the hypothetical popula- 
tion. However, it was exactly the unavailability of such higher-order objective 
chances within the conventional possible-worlds approach that led to the im- 
passe described above. 

In fact, any Humean account of causation will be unable to sustain the 
possibility of higher-order or vertical causation. At bottom, Humeans are anti- 
realists about modality. In the place of irreducible modal facts, they accept only 
regularities in the appearance of occurrent qualities. Since these regularities are 
not themselves instantiated in particular events, there can be no regularities of 
regularities and, hence, no higher-order modalities. 

The modal realist can avoid this collapse of higher-order objective chance to 
triviality by considering partial worlds or situations. A situation is partial, so 
many of the factors that determine objective chance are undetermined in a given 
situation. We can, then, sensibly talk about a hierarchy or cascade of objective 
chances. Meaningful higher-order objective chance could exist whenever there 
are well-defined objective chances of certain factors, whose presence or absence 
would, in turn, determine the objective chance of other factors. 

However, there is, as we shall see, another approach within situation theory 
to defining the causal relevance of facts about objective chance, an approach 
that does not depend on making sense of higher-order objective chance. Rather 
than asking how the objective chance of φ depends on the objective chance of ψ 
given φ, or of ψ given ¬φ, we can instead ask whether “deleting” facts about the 
causal connections between φ and ψ from particular situations leaves enough 
facts behind to enable those situations to cause the relevant instances of φ. This 
talk about “deleting” facts from situation-tokens is metaphorical. We start with 
a token that supports the causal connection between φ and ψ, and then we 
consider proper parts of this token that do not support this connection and ask 
of these parts whether they support enough facts to enable them to count as 
causes of the instances of φ in question. In this case, it is the indispensability 
of facts about causal connections as incorporated in parts of actual situations, 
rather than the probabilistic relevance of those facts to abstract properties, that 
determines the existence of a causal connection. 

In this chapter, I will demonstrate that the claims I will make about the 
possibility of higher-order causation in part II — chapters 12 (teleology), 15 
(logical and mathematical cognition), and 16 (mind) — can be supported by 
the model of causation developed in this part. 


7.2 Modal Facts as Causes 


Modal facts can themselves act as causes. Suppose that s is a minimal cause 
of s’, that is, no proper part of s is a cause of s’. According to the definition 
of causation, s itself must support the modal fact □(As → As’). Any part of s 
that does not support this modal fact must be a proper part, and so must not 
be a cause of s’. 

If we assume a principle of strict downward monotonicity, it follows that any 
type supported by a minimal cause of a token is causally relevant to any type 
supported by that token. 


Hypothesis 7.1 (Strict Downward Monotonicity) If (s ▷ s’) and s₁ ⊏ s’, 
then there exists an s₀ such that s₀ ⊏ s and s₀ ▷ s₁. 
If (s ▷ B) and C ⊂ B, then there exists an s’ such that s’ ⊏ s and s’ ▷ C. 


Strict downward monotonicity entails that if s is a minimal cause of s’, then 
s is not a minimal cause of any proper part of s’. If s is a minimal cause of s’, 
then it is certainly part of a minimal cause, and so s is an INUS cause of s’, 
s~ s’. If strict downward monotonicity holds, then s is not an INUS cause of 
any proper part of s’. This means that (s : φ) is causally relevant to (s’ : ψ), for 
any φ and ψ such that s ⊨ φ and s’ ⊨ ψ. In particular, in the case above, s’s 
being of the modal type □(As → As’) is causally relevant to every type of s’. 

Nomic facts can also be causally efficacious. In the case above, by the definition 
of ▷, s must support the causal constraint s |~ s’. If Hume’s Hypothesis 
applies to this case, then there must be a type φ such that s supports both φ 
and the causal constraint φ |~ ψ. By the definition of causal relevance, we have 
that the causal-constraint type φ |~ ψ supported by s is indeed causally relevant 
to the explanation of s’ and its type ψ. The truth of the causal constraint at 
s is an indispensable part of the explanation of the actuality of an immediately 
posterior situation of type ψ. 

To make this concrete, suppose that s is an event of the collision of a pair of 
billiard balls with specific velocities. The relevant physical type of s (represent- 
ing the masses and velocities of the two balls, as well as their impenetrability 
and elasticity) is φ. The causal constraint φ |~ ψ is a special case of the laws
of conservation of energy and momentum. This nomic fact is causally relevant
to the subsequent velocities of the balls (represented by ψ). Since the type ψ is
observable, our perceptual faculties belong to a causal chain including particular
nomic facts. Such causal connections make possible reference to and knowledge
of such laws of nature. 


7.3 The Causal Relevance of the Excluded 
Middle 


The type (φ ∨ ¬φ) is a paradigm case of a merely disjunctive or gerrymandered
property (see section 5.8.1). Merely disjunctive types are never causally relevant,
since if ((φ ∨ ¬φ) |~ χ) is a causal-constraint fact in a situation s, then (φ |~ χ)
and (¬φ |~ χ) will each be supported by separate, proper parts of s. Instances
of the law of excluded middle are always heterogeneous disjunctions and, hence, 
never represent natural types. 

However, although (φ ∨ ¬φ) may never be causally relevant, the same cannot
be said of the type □(φ ∨ ¬φ). Suppose that token s supports the following types:


• □(φ ∨ ¬φ)
• ((φ & χ) |~ ψ)
• ((¬φ & ρ) |~ ψ)
• χ
• ρ


Let us assume that s does not support any other relevant types; in particular,
let us assume that it does not support ((¬φ & χ) |~ ψ) or ((φ & ρ) |~ ψ). Token
s does support the type ((χ & ρ) |~ ψ), since this follows from the first three
types. However, let us assume that s supports ((χ & ρ) |~ ψ) only because it
supports the first three types. That is, let us assume that any proper part s₁
of s that does not support all of the first three types above does not support
((χ & ρ) |~ ψ).

Given these types, it follows that s constrains the actuality of a succeeding
token of type ψ. To be more precise, s must constrain the existence of a set B
of tokens, each of which is immediately posterior to part of s and each of which
supports the type ψ. If we assume, as seems reasonable, that s as a whole is
causally prior to each member of B, it follows that s is a cause of B, s ▷ B.

Under these assumptions, we can show that s is a minimal cause of B, if 
we also assume Hume’s hypothesis. Suppose that s’ is a proper part of s, one 
that does not support one or more of the types listed above. Since s does not 
support any other relevant types, neither can s’, one of its proper parts. 

Suppose, for example, that s’ supports only the following four relevant types.
It is easy to check, in a four-valued model, that these types are not sufficient to
guarantee the actuality of a succeeding token of type ψ:


• ((φ & χ) |~ ψ)
• ((¬φ & ρ) |~ ψ)
• χ
• ρ


By our earlier assumption, s’ does not support the type ((χ & ρ) |~ ψ). Let s”
be a situation accessible to s’ that supports both χ and ρ, but is not succeeded
by any token supporting ψ. Given the support by s’ of the first three types
above, this entails that s” falsifies both φ and ¬φ (i.e., both ¬φ and φ are
supported by s”). This is possible, since s’ does not support the modalized law
of excluded middle. The existence of s” demonstrates that s’ cannot be a cause
of B, since every member of B supports ψ. Consequently, s is a minimal cause
of B.

As before, strict downward monotonicity entails that every type supported 
by s is causally relevant to every type supported by B, in particular, to type ψ.

Although I have made use, in this argument, of Hume’s hypothesis and the 
hypothesis of strict downward monotonicity, it is not essential to assume that 
these hypotheses hold universally. All that I need is that they hold in some 
cases of the appropriate kind. 

For a concrete illustration, suppose that s represents a situation in which a
rabbit is pursued by a pair of predatory animals, χ representing the presence of
predator P₁ and ρ representing the presence of predator P₂. Let us suppose that
predator P₁ does not yet perceive the rabbit, but will immediately perceive and
devour the rabbit if the rabbit makes any sudden movement (φ). In contrast,
predator P₂ has the rabbit within its perceptual field and will devour it unless
the rabbit makes a sudden movement, in which case P₂ will lose track of the
rabbit’s location. The rabbit notices predator P₂ and, consequently, makes a
sudden movement (φ), resulting in its demise (ψ), in this case, due to the actions
of predator P₁.

My argument is that in this case, the situation s, which records the necessity
of the disjunction φ ∨ ¬φ, plus the two causal constraints, plus the facts χ and
ρ, is in every sense a cause of the rabbit’s demise, and the inclusion in s of the
modalized logical truth is causally relevant to the result.
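
The role of the modalized excluded middle in the rabbit case can be illustrated with a small Python sketch. It is only a toy reading, not the formal semantics of this book: it assumes that a situation can settle the sudden-movement type in one of four ways (supporting it, supporting its negation, neither, or both), and that the necessity of the disjunction rules out accessible situations that leave the matter undecided. The names `STATUSES`, `psi_follows`, `accessible_statuses`, and `constrains_psi` are hypothetical.

```python
# Toy model (an assumption of this sketch): each status records
# (supports phi, supports not-phi) for a candidate accessible situation.
STATUSES = {
    "phi only": (True, False),
    "not-phi only": (False, True),
    "neither": (False, False),  # phi is left undecided
    "both": (True, True),
}


def psi_follows(status, chi=True, rho=True):
    """Check whether (phi & chi) |~ psi or (not-phi & rho) |~ psi applies."""
    supports_phi, supports_not_phi = STATUSES[status]
    return (supports_phi and chi) or (supports_not_phi and rho)


def accessible_statuses(excluded_middle_is_necessary):
    """List the statuses open to accessible situations."""
    if not excluded_middle_is_necessary:
        return list(STATUSES)
    # With the modalized law, every accessible situation must support
    # phi-or-not-phi, so the undecided status is excluded.
    return [name for name, (pos, neg) in STATUSES.items() if pos or neg]


def constrains_psi(excluded_middle_is_necessary):
    """True when chi and rho lead to psi in every accessible situation."""
    return all(
        psi_follows(status)
        for status in accessible_statuses(excluded_middle_is_necessary)
    )


print(constrains_psi(True))   # whole situation s, modalized law included
print(constrains_psi(False))  # proper part lacking the modalized law
```

Under these assumptions the first call yields True and the second False: once the modalized law is dropped, an undecided situation becomes accessible and neither constraint applies to it.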

This result can be generalized to any validity of classical first-order logic, 
by simply substituting the validity for the law of excluded middle, and adding 
causal constraints that interact appropriately with the logical validity. 


7.4 First-Order Teleological Causation 


Suppose that the fact that wings are causally relevant to flight is part of certain 
tokens that cause the successful survival and reproduction of a species v of flying 
bird. The successful survival and reproduction of v is, in turn, causally relevant 
to the existence of a present-day winged thing, namely, an instance of v. Thus, 
the existence of an instance of wingedness is explained, in part, by the causal 
relevance of wingedness to flight. This gives us the initial, Wrightian condition
for saying that flight is the function of wingedness as instantiated in this case. 

We can draw a distinction between intrinsic and extrinsic functions. For 
example, the wing of a bird exists for the sake of flying, and this is a case of
intrinsic purpose. In contrast, seeds serve the purpose of feeding the bird: a 
case of extrinsic purpose. 

Suppose that we let φ represent the state of having wings, and ψ the state
of flying. Finally, let v represent the entire bird/bird-niche ecological system,
including those aspects of the bird’s environment that make possible its successful
reproduction. The fact that the wings serve the intrinsic purpose of flying
can be expressed as:


(s’ : ((φ & v) |~ ψ)) ⇝ (s : φ)

The symbol ⇝ represents the relation of direct causal relevance (as defined
in chapter 5). The state φ has the intrinsic purpose of ψ-ing in the token s,
relative to background condition v, just in case the fact that some state-token
s’ supports a connection between v and φ on the one hand, and ψ on the other,
is causally relevant to s’s being φ. In the case of a species v of flying birds, the
fact that there is a causal connection between being winged and flying is part
of the causal explanation of wingedness in the winged members of v.

In the case of extrinsic purpose, we have instead: 


(s’ : ((φ & v) |~ ψ)) ⇝ (s : v)


In this case, take φ to be the presence of suitable seeds in the environment,
and take ψ to be the fulfillment of the bird’s nutritional needs. In this case, the
connection between v & φ and ψ causes instances of v, not of φ. In other words,
the fact that the seeds fulfill the bird’s needs explains why there are birds, not 
why there are seeds. Nonetheless, we can say objectively that, qua parts of the 
bird’s ecological niche, the seeds do have the extrinsic purpose of fulfilling the 
bird’s nutritional needs. 

Another mode of teleofunctionality is that of representational states, states 
whose function is to carry information of a certain kind. In the following chapter, 
I will define a notion of carrying information (with reference to relation R), 
which we can represent by the symbol ⇒_R. We can say that a particular
pattern of retinal stimulation φ has the intrinsic function in s (relative to v) of
carrying the information that ψ is realized in relation R to s just in case:


(s’ : ((v & φ) ⇒_R ψ)) ⇝ (s : φ)


The pattern φ exists because it carries (in organisms of type v) the information
ψ. We might say that when a state occurs that has the function for an
organism to carry potential information of a certain kind, then that information 
has become actual for that organism. 

It may seem odd to say that particular patterns of retinal stimulation have 
proper functions, as opposed to saying merely that the visual system as a whole 
has a function. However, we can see the visual system as consisting of a set of
capacities for patterns of stimulation. Where the capacity for a certain pattern 
has been selected for because that pattern carries particular information, we can 
say that the pattern itself has the function of carrying that information, since 
whatever causes the capacity of the system to undergo that pattern also causes 
individual occurrences of the pattern. 


7.5 Higher-Order Teleological Causation 


In part II (chapter 16), I will argue that the efficacy of mental properties depends
on the possibility of higher-order functions. For example, consider the
human faculty of inference (whether inductive or deductive). This faculty has
the function of interacting with mental states on the basis of their content, a
paradigmatically mental or psychological property. Suppose, for example, that
mental type φ has the function of first recognizing the simultaneous presence
of a belief in a conditional and a belief in the antecedent of the conditional,
and then producing a new belief (by modus ponens) in the consequent of the
conditional. Suppose we have three state-tokens, s₁, s₂, and s₃, where s₁ is an
instance of the type φ, the state whose function is the performance of modus
ponens. Suppose that s₂ is a state whose type is that of believing both a particular
conditional (p → q) and its antecedent, p. Let us call this type of mental
state ψ. Finally, let s₃ be a state of believing q (call this type χ), immediately
posterior to the sum of s₁ and s₂.

We may suppose that the functionality of type ¢ corresponds to a causal 
constraint of the form: 


((φ & ψ) |~ χ)


Suppose that situation s supports this conditional and contains the sum of
s₁ and s₂. We may finally suppose that in the actual world w, s is actually a
total cause of s₃. Tokens s₁ and s₂ are both indispensable parts of this cause,
and so their mental properties are causally relevant to the outcome. In addition,
the fact that mental properties φ and ψ are instantiated can be used in giving
a causal explanation of the succeeding state.

It may well be true that tokens s₁, s₂, and s₃ also realize physical states μ₁,
μ₂, and μ₃. It may also be the case that the instantiation of μ₁ necessitates the
instantiation of φ by some super-token, and similarly for μ₂ and ψ, and μ₃ and
χ. Finally, there may be a covering physical constraint of the form:


((μ₁ & μ₂) |~ μ₃)


Suppose token s’ is a situation containing s₁ and s₂ and supporting this
conditional. Then we can suppose that s’ is a total cause of s₃. This seems to
make the mental properties supported by s₁ and s₂ redundant or otiose. Such
a conclusion, however, would be a mistake. It is true that s’ is a total cause of
s₃ and that the mental types supported by s₁ and s₂ are irrelevant to the s’-s₃
connection. However, it also remains true that s is a total cause of s₃. Token
s supports the psychological covering law but not the physical one. Hence, in
the context of s, the physical properties of s₁ and s₂ are irrelevant, but their
psychological properties are not.


8 


The Universality 
of Causation 


Does every situation have a cause? On the one hand, there is a strong tempta- 
tion to say “Yes.” On the other hand, embracing the universality of causation in 
its strongest form leads to inconsistency, since we are forced to say that Reality 
(the sum of all actual situations) must itself have a cause, which must be an 
actual situation and thus part of Reality. However, a situation cannot be the 
effect of one of its parts. 

A natural response to this problem, common to Aristotle, Leibniz, and many 
others, is to limit the universality of causation to contingent situations. This 
cannot be quite right, as well-known objections by James Ross and William 
Rowe have shown. However, I think it is approximately correct. What is needed 
is to use the resources of mereology to define a category of “wholly contingent” 
situations. We can coherently suppose that all wholly contingent situations have 
causes. 

The universality of causation, if it is in fact true and knowable, has a number 
of very significant implications for the theory of epistemology of causal facts. As 
we shall see, abduction to unknown causes seems to depend on some assumption 
about the universal scope of causation. The induction of causal laws may have 
a similar dependence on this assumption. 

As I have discussed earlier, it is important to distinguish the thesis of the 
universality of causation from that of determinism. Determinism can be taken 
as the conjunction of some sort of principle of the universality of causation 
with the thesis that causes necessitate their effects. We have already seen a 
number of reasons for rejecting the necessitation model of causation. Reflection 
on the universality of causation gives us a further reason: if causes necessitate 
their effects, then it cannot be the case that all wholly contingent situations 
have causes. The determinist must come up with some alternative restriction 
of the universality of causation, perhaps to temporally bounded situations. In 
addition, the determinist needs to produce some independent motivation for 
this restriction.


8.1 A Modal Mereology of Situations 


My formal framework will be a modal logic supplemented by the Leśniewski-Goodman-Leonard
calculus of individuals (“mereology”) (Leonard and Goodman
(1940)).

By way of modal logic, I need only the axioms and rules of T. I will assume
a fixed domain of possible situations; hence, the logic will include the Barcan 
and converse Barcan axioms. 

I will use the two usual predicate symbols of mereology, ⊑ and ○, representing
part-of and overlap, respectively. I need three mereological axioms:


Axiom 8.1 x ⊑ y ↔ ∀z (z ○ x → z ○ y)
Axiom 8.2 ∃x φ(x) → ∃y ∀z (z ○ y ↔ ∃u (φ(u) & u ○ z))
Axiom 8.3 x = y ↔ (x ⊑ y & y ⊑ x)


Axiom 8.1 defines the part-of relation in terms of overlap, and axiom 8.2 is 
an aggregation or fusion principle: if there are any facts of type ¢, then there 
is an aggregate or sum of all the ¢ facts. Axiom 8.3 guarantees that the part-of 
relation is reflexive and anti-symmetric. 
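
One way to get a feel for these three axioms is to check them in a small finite model. The Python sketch below assumes, purely for illustration, that situations are the non-empty sets of three hypothetical atoms, that part-of is the subset relation, and that overlap is the sharing of an atom. The function names are hypothetical and the model is far simpler than the intended domain of possible situations.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")  # hypothetical atomic situations


def situations(atoms=ATOMS):
    """All non-empty sets of atoms, each standing in for one situation."""
    return [
        frozenset(combo)
        for size in range(1, len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


def part_of(x, y):
    return x <= y


def overlaps(x, y):
    return bool(x & y)


def fusion(members):
    """Aggregate of a non-empty collection of situations."""
    members = list(members)
    if not members:
        raise ValueError("axiom 8.2 only aggregates non-empty collections")
    return frozenset().union(*members)


def check_axiom_8_1(domain):
    """x is part of y iff everything overlapping x also overlaps y."""
    return all(
        part_of(x, y) == all(overlaps(z, y) for z in domain if overlaps(z, x))
        for x in domain
        for y in domain
    )


def check_axiom_8_2(domain, predicate):
    """If anything satisfies the predicate, its instances have a fusion."""
    members = [u for u in domain if predicate(u)]
    if not members:
        return True
    total = fusion(members)
    return all(
        overlaps(z, total) == any(overlaps(u, z) for u in members)
        for z in domain
    )


def check_axiom_8_3(domain):
    """Identity coincides with mutual parthood."""
    return all(
        (x == y) == (part_of(x, y) and part_of(y, x))
        for x in domain
        for y in domain
    )
```

In this toy domain all three checks succeed, for example with `check_axiom_8_2(situations(), lambda u: u <= frozenset("ab"))`. That shows only that the axioms are jointly satisfiable in a simple model, not that the model captures everything intended by them.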

There are two principles linking the modal and mereological languages. Here 
I need to introduce a new predicate, A. Where b is a possible situation, Ab can
be used to state that b actually obtains. 


Axiom 8.4 x ⊑ y → □(Ay → Ax)
Axiom 8.5 □(∀y ∈ F (Ay) → A(ΣF))


Axiom 8.4 ensures that aggregation of situations is a form of conjunction: 
a whole necessitates all of its parts. Conversely, axiom 8.5 implies that the 
existence of all the members of a sum necessitates the existence of the sum 
itself.

There is one special notion to be defined: that of being “wholly contingent,” 
represented by ‘V’. 


Definition 8.1 Vx ↔ (Ax & ∀y (y ⊑ x → ¬□Ay))


A wholly contingent situation is an actual situation none of whose parts 
are necessary. I am not assuming that there are any necessary situations: the 
existence of necessary truths does not entail the existence of necessary situations 
(since our logic lacks a comprehension principle). As we shall see, if there are 
any necessary situations, they are situations of a very special kind. 


8.2 Principles of Causation


The causal relation will be represented by a primitive binary operator, ‘▷’.
There are a number of logical properties of causation that can be expressed,
for instance, the transitivity and asymmetry of causation. I will, however, need 
only three facts about causation for the present purposes: 


Axiom 8.6 Veridicality: (x ▷ y) → (Ax & Ay)
Axiom 8.7 Separate Existence: (x ▷ y) → ¬(x ○ y)
Axiom 8.8 Universality: ∀x (Vx → ∃y (y ▷ x))


Axiom 8.6 stipulates that only actual situations can serve as causes or effects. 
Axiom 8.7 is intended to capture Hume’s insight that a cause and its effect 
must be “separate existences.” The language of mereology, when applied to 
facts, enables us to state Hume’s principle precisely: a cause must not overlap 
its effect. It is very important to bear in mind that axiom 8.7 does not require 
that a cause not overlap its effect in space or time: it is only mereological 
overlap (the having of a common part) that is ruled out. Axiom 8.8 expresses 
the universality of the causal relation: every wholly contingent fact has a cause. 
Axiom 8.8 does not entail determinism in any of its usual senses, since I have not 
stated that causes are sufficient conditions for their effects. I am not assuming 
that every event is necessitated by its causes; in fact, I believe that this is not 
typically the case. Causal laws are always exception-permitting or defeasible 
generalizations. It is quite possible for C to be in every sense the cause of E, 
even though it was possible for C to occur without being accompanied by E. 
(For this reason, this account of causation is compatible with, although it does 
not entail, indeterministic theories of human freedom.) 

The evidence for axiom 8.8 is essentially empirical. Every success of common 
sense and science in reconstructing the causal antecedents of particular events 
and classes of events provides confirmation of axiom 8.8. 


8.3 The Universality of Causation 
8.3.1 The Role of Defeasible Reasoning 


Even though we have excellent empirical evidence for the generalization that 
wholly contingent situations have causes, it is hard to see how any amount of 
data could settle conclusively the question of whether or not this generalization 
(axiom 8.8) admits of exceptions. This is a legitimate worry, but I would respond 
by insisting that, at the very least, our experience warrants adopting the causal 
principle as a default or defeasible rule. This means that, in the absence of 
evidence to the contrary, we may infer, about any particular wholly contingent 
situation, that it has a cause. 

Using the modal conditional □→ to represent a kind of defeasible connection,
we can express the weakened form of axiom 8.8 thus:


Axiom 8.8’ ∀x (Vx □→ ∃y (y ▷ x))


This version of axiom 8.8 can be read as: normally, a wholly contingent 
situation has a cause. This defeasible axiom 8.8’ will allow us to infer that 
any given wholly contingent fact has a cause unless some positive reason can 
be given for thinking that the fact in question is an exception to the rule, for 
example, by showing that the fact belongs to a category of things that typically 
does not have a cause. 
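
The inferential behavior of such a default rule can be pictured with a very small Python sketch. It is not an implementation of the defeasible conditional itself. It only assumes that positive reasons for treating a situation as an exception are supplied as a list of hypothetical `defeaters`, and that a blocked default yields no conclusion rather than the opposite conclusion.

```python
def infer_has_cause(is_wholly_contingent, defeaters=()):
    """Apply the default 'wholly contingent situations normally have causes'.

    Returns True when the default licenses the inference, and None when the
    rule does not apply or is blocked. It never returns False, because a
    defeated default gives no ground for concluding that there is no cause.
    """
    if not is_wholly_contingent:
        return None  # the rule is silent about other situations
    if any(defeaters):
        return None  # a positive reason for an exception blocks the default
    return True


# A wholly contingent situation with no known defeater: presume a cause.
print(infer_has_cause(True))
# The same situation once some reason for an exception has been established.
print(infer_has_cause(True, defeaters=[True]))
```

The asymmetry is deliberate: absence of evidence to the contrary suffices for the inference, while a defeater merely withdraws it.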


8.3.2 Is Universality Merely Heuristic? 


In his debate with Copleston, Russell insisted that there is a difference between 
claiming that scientists should always look for a cause and claiming that there 
is always a cause there to be found. Russell followed Kant’s suggestion that the 
universality of causation be seen as a canon or prescriptive rule for reason, and 
not as a description of mind-independent reality. The cosmological argument 
depends on using the principle of universality as a descriptive generalization. 

I have two principal responses. First, it is hard to see why the abundant 
success of empirical science in finding causes for contingent facts does not provide 
overwhelming empirical support for the generalization to all contingent facts. 
The category of wholly contingent facts is not an unnatural, gerrymandered 
kind like ‘grue’ or ‘bleen’. Are we to believe that it is merely a coincidence that 
time and time again we find causes for contingent facts? 

Second, the denial of the universality of causation as a descriptive gener- 
alization constitutes a very radical form of skepticism. All of our knowledge 
about the past, in history, law, and natural science, depends on our inferring 
causes of present facts (traces, memories, records). Without the conviction that 
all (or nearly all) of these have causes, all of our reconstructions of the past 
(and therefore nearly all of our knowledge of the present) would be ground- 
less. Moreover, our knowledge of the future and of the probable consequences 
of our actions depends on the assumption that the relevant future states will 
not occur uncaused. The price of denying this axiom is very steep: embracing 
a comprehensive Pyrrhonian skepticism. 


8.4 The Existence of an Uncaused First Cause 


Besides the logical principles presented above, the proof of the existence of a 
first cause depends on only one factual premise: that there exists a contingent 
situation. For example, suppose there are an odd number of molecules in my 
pencil at the present moment: surely there could have been an even number. A 
single contingent situation of this kind is all that I need, although I believe that 
nearly every fact with which we are acquainted is contingent. I would go so far 
as to say that every physical situation is contingent. 


8.4.1 The Nature of Modality


In saying that a situation is contingent, I am saying much more than merely 
that the proposition asserting its existence is neither logically true nor logically 
false. A contingent situation is one that is actual but could have been non- 
actual, where the relevant notion of possibility is that of broadly metaphysical 
possibility. Broadly metaphysical possibility is the fundamental form of possi- 
bility, of which all other kinds (physical, historical, legal, etc.) are qualifications 
or restrictions. 

Attempts since the days of logical positivism to reduce metaphysical possi- 
bility to logical consistency (or logical consistency with all definitional or “an- 
alytic” truths) have failed. First, it has proved impossible to specify the “an- 
alytic” truths without making reference to possibility and necessity. Second, 
nothing is gained in clarity unless we insist on using first-order logic, which, 
as John Etchemendy (1990) has argued, is an implausible construal of logical 
consistency. Finally, the attempt to avoid the supposed “mysteries” of meta- 
physical possibility in this way leads to the much more serious difficulties of 
set-theoretic platonism, with the attendant mysteries of how these transcendent 
mathematical entities connect to the rest of reality and, most crucially, of how 
we can obtain reliable knowledge of them. Recent efforts at making sense of 
mathematical reality make use of the notion of metaphysical modality (as in 
the “possible structures” of Hellman (1989)), indicating that the proper order 
of explanation starts with modality, not with mathematical entities.

If we deny that there are any contingent situations, then we must conclude 
that we live in a world in which all three modalities — possibility, actuality, 
and necessity — collapse together. This is tantamount to denying that these 
modalities can do any interesting work. Such a denial runs athwart the growing 
body of philosophical work in which modality plays a central role. 


8.4.2 A Sketch of the Proof 


Lemma 8.1 All the parts of a necessary situation are themselves necessary. 
Proof: By axiom 8.4 and the K axiom of modal logic. 
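
The one-line proof of lemma 8.1 can be spelled out as a short derivation. The following is a sketch in LaTeX notation, where A is the actuality predicate, the box is necessity, and the square subset sign is part-of. Step 2 is the relevant instance of the K axiom.

```latex
\begin{align*}
&1.\quad x \sqsubseteq y \rightarrow \Box(Ay \rightarrow Ax)
  && \text{axiom 8.4} \\
&2.\quad \Box(Ay \rightarrow Ax) \rightarrow (\Box Ay \rightarrow \Box Ax)
  && \text{K axiom} \\
&3.\quad (x \sqsubseteq y \;\&\; \Box Ay) \rightarrow \Box Ax
  && \text{from 1 and 2}
\end{align*}
```

Line 3 says that any part x of a necessary situation y is itself necessary.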
Lemma 8.2 Every contingent situation has a wholly contingent part. 


Proof: Let a be a contingent situation. If a is wholly contingent, we are
through, since a is a part of itself. Otherwise, a has a necessary part. By axiom
8.2, there exists a situation x̂(x ⊑ a & □Ax) that consists of the aggregate
of all the necessary parts of a. Since a is contingent, a itself is not a part of
x̂(x ⊑ a & □Ax), since if it were, then, by axiom 8.3, a would be identical
to x̂(x ⊑ a & □Ax), and, by axiom 8.5, a would exist necessarily, being a
sum of necessary parts. By axiom 8.1, there is a b that overlaps a but not
x̂(x ⊑ a & □Ax), hence there is a part of a, say c, that is not a part of
x̂(x ⊑ a & □Ax).

We can show that c is wholly contingent. Suppose that d is a part of c.
Then d is part of a but d does not overlap x̂(x ⊑ a & □Ax). Hence, d is not
necessary. Since d was an arbitrary part of c, c is wholly contingent.


Definition 8.2 Let C be the aggregate of all wholly contingent situations. 


By axiom 8.2, it follows that if there are any wholly contingent facts, then 
any fact overlaps C if and only if that fact overlaps some wholly contingent 
situation. 


∃x Vx → ∀y (y ○ C ↔ ∃z (Vz & y ○ z))


Lemma 8.3 If there are any contingent situations, C is a wholly contingent 
situation. 


Proof: Suppose that there is at least one contingent situation. Then there 
is also a wholly contingent part, by the preceding lemma. To show that C is 
wholly contingent, we must show that every part of C is contingent. Let a be a 
part of C. Since a is a part of C, a overlaps C, by axioms 8.1 and 8.3. Hence, 
a overlaps some wholly contingent b (by the definition of C). It is a theorem of 
mereology that two facts that overlap have a common part. Hence, some d is 
part of both a and of b. Since b is wholly contingent, d is contingent. By lemma 
8.1, if a were necessary, d would be necessary. Consequently, a is contingent.
Therefore, since a was an arbitrary part of C, C is wholly contingent.


Lemma 8.4 If there are any contingent situations, C has a cause.


Proof: An immediate consequence of lemma 8.3 and axiom 8.8, the Univer- 
sality of Causation. 


Lemma 8.5 Every contingent situation overlaps C. 


Proof: Let a be a contingent situation. By lemma 8.2, a has a wholly 
contingent part, say b. By axiom 8.2 and the definition of C, C and b overlap.


Theorem 8.1 If there are any contingent situations, then C has a cause that is
a necessary fact. 


Proof: By lemma 8.4, C has a cause. By axiom 8.7 (Separate Existence), this 
cause does not overlap C. By lemma 8.5, every contingent situation overlaps 
C. By axiom 8.6 (Veridicality), the cause of C is actual. Hence, the cause of C
must be a necessary situation. 
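
The structure of the proof can be traced in a finite toy model, sketched below in Python. The sketch makes several assumptions that go beyond the text: situations are non-empty sets of three hypothetical atoms, every situation in the model is actual, and a situation counts as necessary exactly when all of its atoms are necessary (which respects lemma 8.1 and axiom 8.5 in this setting). It verifies the lemmas for this one model only and is no substitute for the proof.

```python
from itertools import combinations

NECESSARY = {"n1"}         # hypothetical atom that obtains necessarily
CONTINGENT = {"c1", "c2"}  # hypothetical atoms that obtain contingently
ATOMS = sorted(NECESSARY | CONTINGENT)

# Every non-empty set of atoms stands in for one actual situation.
SITUATIONS = [
    frozenset(combo)
    for size in range(1, len(ATOMS) + 1)
    for combo in combinations(ATOMS, size)
]


def parts(x):
    return [y for y in SITUATIONS if y <= x]


def overlaps(x, y):
    return bool(x & y)


def necessary(x):
    # Modeling assumption: necessary iff built only from necessary atoms.
    return x <= NECESSARY


def contingent(x):
    return not necessary(x)


def wholly_contingent(x):
    # Definition 8.1, given that every situation here is actual.
    return all(not necessary(y) for y in parts(x))


# Definition 8.2: C is the aggregate of all wholly contingent situations.
C = frozenset().union(*[x for x in SITUATIONS if wholly_contingent(x)])

lemma_8_2 = all(
    any(wholly_contingent(p) for p in parts(x))
    for x in SITUATIONS
    if contingent(x)
)
lemma_8_3 = wholly_contingent(C)
lemma_8_5 = all(overlaps(x, C) for x in SITUATIONS if contingent(x))

# Separate Existence: a cause of C must not overlap C.
candidate_causes = [x for x in SITUATIONS if not overlaps(x, C)]
theorem_8_1 = all(necessary(x) for x in candidate_causes)

print(sorted(C), lemma_8_2, lemma_8_3, lemma_8_5, theorem_8_1)
```

In this model C is the sum of the two contingent atoms, and the only situation left over as a candidate cause of C is the necessary one, as theorem 8.1 requires.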

Since we know that there is at least one contingent situation, we can identify 
C with the cosmos, and use theorem 8.1 to conclude that the cosmos has a cause 
that is a necessary fact, a first cause. It is legitimate to call this cause a “first 
cause” if we assume (as seems plausible) that all effects are contingent.¹


¹ For a discussion of some of the theological implications of this result, see my article “A
New Look at the Cosmological Argument” (Koons (1997)).


8.5 The Well-Foundedness of Causation 


Is the causation relation well founded? Are infinite causal regresses impossible? 
Beginning with Plato (in Book X of The Laws), many philosophers have thought 
so. However, many others, especially in the twentieth century, have expressed 
doubts on this point. 

It is a corollary of my version of the cosmological argument that the causal 
relation is well founded. Suppose for contradiction that we have an infinite
causal regress: ... ▷ s_n ▷ ... ▷ s_1 ▷ s_0. Let us call the sum of the regress s_∞.
Only wholly contingent tokens can be caused, so each of the members of the
series is wholly contingent. Consequently, s_∞ is wholly contingent. By axiom
8.8, s_∞ has a cause, s_{∞+1}.

However, s_{∞+1} cannot be an immediate cause of any of the members s_n
of the series, since s_{∞+1} is screened off from s_n by s_{n+1}. Suppose, for
contradiction, that s_{∞+1} were a cause of s_n. Then s_{n+1} would be preempted
from causing s_n, since s_{∞+1} is causally prior to s_{n+1}. This contradicts our
assumption that s_{n+1} is a genuine cause of s_n. Therefore, s_{∞+1} cannot be the
immediate cause of any member of the series.

Since s_{∞+1} is not an immediate cause of any of the members of the series,
it cannot be a mediate cause of any of them either, since mediate causation is
simply the transitive closure of immediate causation.

So, s_{∞+1} does not cause any of the members of the series, and therefore, it
does not cause the sum of the series, s_∞, contrary to our original assumption.

In many cases, the impossibility of an infinite regress has been used as a 
premise in the cosmological argument. I think, however, that it is more illumi- 
nating to think of it as a corollary. 


8.6 Objections 


8.6.1 Isn’t Causation Valid Only for the Phenomenal 
World? 


In the first Critique, Kant argues that causation pertains only to the apparent 
or “phenomenal” world, not to the real or “noumenal” world. His argument 
depends on assuming that the fundamental causal principles are known prior to 
experience, and that nothing substantial or material about the real world can 
be known by us prior to experience. Kant’s objection is relevant only to a priori 
arguments for God’s existence, like those of Scotus or Leibniz. It is not relevant 
to an argument like mine that rigorously appeals only to empirical, a posteriori 
arguments. I am not claiming that the axioms of causality I am appealing to
are known by us prior to their application to the world of experience. Instead, 
I appeal to our success in finding causal explanations as empirical evidence for 
these generalizations. 


8.6.2 What about Quantum Mechanics?


Quantum mechanics is sometimes taken to provide abundant counter-evidence 
to the universality of causation. Quantum mechanics raises two problems for 
our understanding of causality: the indeterminism of wave collapse (under the 
Copenhagen interpretation), and the Bell inequality theorems. 

The indeterminism of quantum transitions during observation does not contradict
axiom 8.8. I have not assumed that causes necessitate their effects:
in fact, I strongly suspect that such an assumption is incoherent (if “necessitate”
is understood in a strong sense). According to the Copenhagen version of
quantum mechanics, every transition of a system has causal antecedents: the
preceding quantum wave state, in the case of Schrödinger evolution, or the preceding
quantum wave state plus the observation, in the case of wave packet
collapse.

The Bell inequalities demonstrate that the data described by quantum mechanics
force us to reject one of the following three principles:


• Causal influences never travel backward in time.
• Causal influences never travel faster than the velocity of light.
• Every reliable (projectible) correlation has a causal explanation.


In discussions of the Bell inequalities, the third principle is sometimes la- 
beled a law of “causality.” It is, however, much stronger than my axiom 8.8. In 
this chapter, I have not assumed that (as the third principle implies) a cause al- 
ways “screens off” (in Reichenbach’s sense) its effects from non-posterior states, 
although I will make use of this assumption in appendix B. 

The Bell inequalities are merely another demonstration of the impossibility 
of reducing causation to some sort of statistical relationship. They raise no dif- 
ficulties for a causal realist such as myself. In my opinion, the most reasonable 
response to the Bell inequalities would be to restrict one or more of the three 
principles above to macroscopic (large-scale or classical) phenomena and to re- 
state them as defeasible (exception-permitting) rules. I would favor restricting 
the second principle, applying it only to direct, macro-to-macro interactions, 
interactions between classical systems. Where causal influences between clas- 
sical systems are mediated by quantum phenomena (which, on my view, have 
no intrinsic position or velocity), then exceptions to the second principle can 
occur. These exceptions do not, however, permit the exchange of information 
at superluminal velocities. 


8.6.3 Doesn’t the Argument to a First Cause Assume 
the Impossibility of an Infinite Regress? 


Leibniz was the first to realize that the cosmological argument does not depend 
on any assumption about the impossibility of infinite regresses. Even if there 
are infinite regresses of causes within the totality of contingent facts, the 
totality itself must have a cause that is outside it and, hence, a cause that 
is necessary. 
The crucial assumption is axiom 8.2, the assumption that any non-empty set 
of situations can be aggregated into a single situation. This corresponds to 
the pre-modern denial of infinite regress, since it in effect denies that any such 
totality is what Cantor termed an “absolute” or improper totality (like the set 
of all sets, or the set of ordinal numbers). 

There is little if any reason to think that there is anything improper about 
the totality of all wholly contingent situations. We are talking only about on- 
tologically basic situations, not about mathematical or semantical truths that 
supervene upon them. I am simply aggregating concrete particulars, and I am 
not running afoul of Russell’s vicious circle principle in the process. There is no 
reason to postulate any facts that somehow involve or presuppose the totality 
of all situations, or of all contingent situations. 


8.6.4 Doesn’t the Argument Commit the Fallacy 
of Composition? 


Russell accused Copleston of committing the fallacy of composition, arguing 
that because each of the parts of the world is caused, the whole must be caused. 
The cosmological argument includes no such error: it is demonstrated that the 
cosmos is itself a wholly contingent situation, and for that reason must have a 
cause. 


8.6.5 Isn’t Necessary Existence an Impossibility? 


A number of twentieth-century philosophers follow Hume in holding that only 
logical truths can be necessary, that the very notion of a necessary existence is 
incoherent. 

Two replies. First, I have not assumed the existence of a necessary situation: 
this was the conclusion, not a premise, of the argument. Thus, this so-called 
objection simply fails to engage the argument. The objector is content merely 
to deny the conclusion without bothering with the premises or the reasoning. 

Second, the Humean principle being relied upon is self-defeating. Is it sup- 
posed to be true by definition that only logical or definitory truths are neces- 
sary? Surely in saying this, Hume, Russell, and others intended to be saying 
something informative. How could such a principle be contingent? What sort 
of contingent facts about the actual world make it the case that there are no 
non-logical necessities? What empirical justification have the anti-essentialists 
provided for their claim? 

In response, the objector must simply deny that he can make any sense 
of this notion of modality, except insofar as it is replaced by the clear and 
well-behaved notion of logical consistency. This sweeping denial of modality is 
simply obscurantist, undermining fruitful philosophical research into the nature 
of natural law, epistemology, decision, action and responsibility, and a host of 
other applications. 


8.6.6 Don’t Contingent Facts Typically Have 
Contingent Causes? 


This is probably the most promising line of rebuttal to the cosmological argu- 
ment. It is an instance of a wider strategy: focus on some unique feature of the 
first cause, and point out that the cause of the world’s having that feature is 
an exception to some well-established generalization. Indeed, for the most part, 
contingent situations do have contingent causes. They also have causes with 
finite attributes and causes that can be located in space and time, unlike the 
hypothesized first cause. Once we have established that the cosmos is relevantly 
unusual, we seem to be faced with two equally unattractive options: supposing 
that the cosmos has only a very unusual kind of cause, or supposing that it has 
no cause at all. Thus, we end in a stalemate. 

The defender of the cosmological argument must respond with substantial 
reasons for thinking that, although the first cause is unique in a number of 
respects, each of these unique features can be adequately explained by extrap- 
olating from tendencies already observable in ordinary cases of causation. For 
instance, I would conjecture that, in some precise sense, a cause is always more 
nearly necessary or less profoundly contingent than its effect. 

One very simple definition of relative necessity would be the following: 


$$a \text{ is more nearly necessary than } b \;=_{df}\; \forall x \sqsubseteq b\,\bigl(\Box(Ax \rightarrow Aa) \;\&\; \Diamond(Aa \;\&\; \neg Ax)\bigr)$$


In other words, a situation a is more nearly necessary than situation b just 
in case a holds in every world in which any part of b holds, but a could exist in 
the absence of any part of b. 
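
The definition can be checked mechanically on a small finite model. The following is only a minimal sketch under stated assumptions: each situation is identified with the set of possible worlds in which it is actual, every world counts as possible relative to every other, and the parts of b (including b itself) are supplied as a list. The world names and the function name `more_nearly_necessary` are hypothetical illustrations, not part of the theory.

```python
from typing import FrozenSet, Iterable

World = str
# Assumption: a situation is modeled by the set of worlds in which it is actual.
Extension = FrozenSet[World]


def more_nearly_necessary(a: Extension, parts_of_b: Iterable[Extension]) -> bool:
    """Return True iff, for every part x of b, a is actual in every world
    where x is actual, and a is actual in some world where x is not."""
    for x in parts_of_b:
        necessitated = x <= a      # every x-world is an a-world
        separable = bool(a - x)    # some a-world is not an x-world
        if not (necessitated and separable):
            return False
    return True


# Hypothetical three-world model.
a = frozenset({"w1", "w2", "w3"})
# b itself is actual only in w1; one proper part of b is also actual in w2.
b_parts = [frozenset({"w1"}), frozenset({"w1", "w2"})]

a_over_b = more_nearly_necessary(a, b_parts)
b_over_a = more_nearly_necessary(b_parts[0], [a])
print(a_over_b, b_over_a)
```

In this toy model a comes out more nearly necessary than b, while the converse fails, which is the asymmetry the definition is meant to capture.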

That the causal antecedents of a situation-token are more nearly necessary 
than the token itself follows from the identity conditions of situation-tokens. 
The causes of a token are essential to its identity: had the very same truth 
been verified by a situation caused in a different way, we would not have had 
the same situation as verifier. The corresponding thesis involving effects is not 
plausible: a situation’s identity does not include the eventuality of all its effects. 
The contingency of the evolution of the world depends on this asymmetry: a 
situation’s holding necessitates the holding of its causes, but not of its effects. 

This principle (an effect necessitates the existence of its causes) does not 
imply that the content of an effect necessitates the content of its causes. For 
example, the situation of Caesar’s death could not have existed had not all of its 
causes, including Brutus’s knife thrust, existed. This of course does not mean 
that Caesar wouldn’t have died unless Brutus and the other senators had killed 
him. The truth ‘Caesar died’ would have been verified by a different situation 
in all of those worlds in which Brutus does not help in inflicting the fatal set of 
wounds. The situation that actually verifies the truth ‘Caesar died’ would not 
have existed had any of its causes failed to exist. 

There are several additional reasons (besides the one involving the identity 
conditions of situations) for thinking that causes are more necessary than their 
effects. First, there is the authority of Aristotle and the Aristotelian tradition. 
Second, it is clear that we need some account of causal priority that explains 
the transitivity and asymmetry of this relation. An account of causal priority 
in terms of relative necessity nicely satisfies this desideratum. 

Third, this account enables us to specify exhaustively the “potential causes” 
of a given situation: a is a potential cause of b if and only if a is more necessary 
(less contingent) than b. Such a specification is necessary if we are to account 
for the statistical properties of causal connections, the so-called Markovian prin- 
ciples developed by Salmon (1984) and Suppes (1984) and studied recently by 
Pearl and Verma (1991) and by Spirtes et al. (1993). I use these Markovian 
principles in developing a causal calculus in appendix B. Markov locality en- 
tails that the causal antecedents of an event “screen off” the probability of that 
event from the probability of any non-consequent event-token. If we assume 
that the probability of every actual event-token is screened off in this way by 
its actual causes, then we are implicitly assuming that the causal antecedents 
of any actual token are necessary to its identity, that there are no non-actual or 
counterfactual causes of actual tokens. 

Finally, the relative necessity of causally antecedent tokens gives us an ex- 
planation of the asymmetry of past and future. In some sense, given the present, 
the past is fixed in a way that the future is not. This “fixity” of the past can 
best be understood as the relative necessity of past event-tokens, given the to- 
ken event corresponding to the present. It is not that the type of the present 
moment necessitates the types of past moments, since there could certainly have 
been many different histories leading up to an event qualitatively identical to 
the-world-at-the-present-moment. Instead, the event-token that is present ne- 
cessitates the event-tokens making up the past, but it leaves open a number of 
different sequences of future event-tokens. Since past tokens are causally an- 
tecedent to the present, we have another (and I think conclusive) reason for 
accepting the thesis of the relative necessity of causally antecedent tokens. This 
thesis is implicit in all “branching-future” models of temporal logic. (See section 
10.2 for further evidence on this point.) 

However relative contingency is defined, it is clear that the cosmos is a 
situation of absolutely minimal contingency. If situation a contains situation b 
as a part, then b is no less contingent (no more necessary) than a, since a could 
not exist if b did not exist. Since the cosmos contains every wholly contingent 
situation as a part, no wholly contingent situation can be less contingent than 
the cosmos. Since the cosmos is a situation of minimal contingency, it is not 
surprising that it should have no contingent cause, but it would still be very 
surprising if it had no cause at all. 

These considerations lead to a new version of the critical axiom 8.8, the 
axiom of causality. 

Axiom 8.8″  $\forall x\,\bigl(Ax \mathrel{\Box\!\!\rightarrow} \exists y\,(y \text{ is more nearly necessary than } x \;\&\; y > x)\bigr)$

On the basis of induction, we can confirm that, at every degree of neces- 
sity (short of absolute necessity), every token is caused by some token more 
necessary than it. As we successfully build scientific models that stretch across 
astronomical and geological time, we confirm that situation-tokens across a wide 
swath of degrees of necessity have causes that are strictly more nearly necessary 
than themselves. Axiom 8.8″ is the defeasible generalization of this pattern. 
Axiom 8.8″ states that we may reasonably infer, about any token at any degree 
of necessity, that it has a causal antecedent which is more nearly necessary than 
it. When we try to apply axiom 8.8″ to a necessary fact (or any fact that is 
not wholly contingent), we find that the defeasible conclusion is blocked, since 
there is no fact more nearly necessary than an absolutely necessary fact. When 
we apply axiom 8.8″ to the cosmos, or to any other minimally contingent fact, 
we succeed in drawing the defeasible conclusion, and in addition, we have an 
explanation as to why the cause of the cosmos is necessary. 

In fact, axiom 8.8″ does not depend on the strong assumption that every 
token necessitates every one of its causal antecedents. It is sufficient to make the 
much weaker assumption, that every token necessitates at least one of its causal 
antecedents. The cosmos must have a causal antecedent that it necessitates, 
and this necessitated cause must be absolutely necessary. 


8.6.7 Where Did the First Cause Come From? 


If we’re right in thinking that causes must be strictly more nearly necessary than 
their effects, it follows that necessary situations cannot be caused (at least, not 
in the ordinary sense). 

Another reason for thinking that necessary situations cannot be effects is 
this: we know that the totality of all situations cannot be caused (since there 
is no situation that does not overlap it), and the best explanation of this situa- 
tion is that this totality contains necessary situations, and necessary situations 
cannot be caused. 


8.6.8 The Ross Objection: Did the First Cause 
Cause That It Caused the World? 


James Ross (Ross, 1969, pp. 295-304) has argued that the principle of sufficient 
reason can be demonstrated to be false. His objection can be adapted into an 
objection to my axiom 8.8 (the Universality of Causation) as follows. Consider 
the situation that the first cause caused the cosmos. Call this situation C*. 
C* is clearly a contingent situation, since if it were necessary, the cosmos itself 
would be necessary (by Axiom 8.6, veridicality). If C* is also wholly contingent, 
then it must be a part of the cosmos, and the first cause must cause C*, i.e., 
the first cause must cause the situation that it causes the cosmos. The same 
argument can be repeated, showing that the first cause must cause that it causes 
that it causes the cosmos, ad infinitum. This appears to be a vicious infinite 
regress. 

The best answer to this objection is to point out that there is no reason to 
think that C* is wholly contingent. The situation that the first cause causes 
the cosmos would appear to be composed of two situations: namely, the first 
cause on the one hand and the cosmos on the other. The truth that the first 
caused the second does not represent a third situation in addition to the first 
two. Instead, such statements about single-case causal connections supervene 
upon the cause, the effect, and certain non-situational truths about the modal 
relationship between the cause and the effect. Therefore, the wholly contingent 
part of C* is simply the cosmos itself, and we are forced only to reaffirm that 
the first cause does cause the cosmos. 

This response entails that there are no situations, over and above situa- 
tions about modality and other non-causal matters, corresponding to single-case 
causal nexus. That is, we are assuming that causal truths are supervenient on 
modal and other non-causal truths (including truths about objective chance or 
propensity, and about powers and liabilities). Causal connections between sit- 
uations in a world are to be explained entirely in terms of what has happened 
in that world, and what might or probably would happen in it and alternative 
worlds. This sort of modest ontological reduction is quite attractive, since the 
alternative is to posit causal nexus as brute situations, without any logical re- 
lationship to predictability or to statistical regularities. At the same time, this 
sort of modest reduction does not entail the eliminability of causal discourse, nor 
does it obviate in any way the necessity of positing situations as an ontological 
category. Causation is a relation between situations, not any kind of proposi- 
tional operator, but any particular causal nexus between situations consists of 
some aggregation of other modal, stochastic, and historical situations. 


8.6.9 William Rowe’s Objection 


William Rowe (Rowe, 1975, pp. 108-110) has proposed a variant of Ross’s 
objection to the cosmological argument. Rowe asks us to consider the situation 
a that corresponds to the true proposition: there are contingent (positive) situ- 
ations. Most defenders of the cosmological argument will accept that a is itself 
contingent. Therefore, the first cause must cause a. However, the situation that 
the first cause has caused a is itself a contingent situation, so the first cause 
would have to cause the situation that it caused a, and so on, ad infinitum. 

The proper response to this objection is only slightly different from the 
response to the last objection. The proposition that there are contingent situa- 
tions does not correspond to a single situation. Situations are not closed under 
existential generalization, as propositions are. From the existence of a situation 
that n has F, it does not follow that there is a distinct situation that 
something has F. Consequently, the situation that makes Rowe’s a true is simply 
the cosmos itself, and no infinite regress can be generated. 

This is not simply an ad hoc response, since there are independent grounds 
for denying the existence of a special category of existential situations. 
Causation is transparent: that is, if the situation that there is an F caused a, 
then there is some n such that the situation that n is F caused a. Similarly, if 
the proposition that there is an F has been made true by some situation a, then 
there is some instance of this generalization that has been made true by a. 
Thus, in neither case is there any reason to posit a special category of situation 
corresponding to the existential quantifier. 


9 


A Theory of Information 
and Misinformation 


9.1 Introduction 


Attempts to explicate the phenomenon of representation naturalistically, say, 
in terms of causal connection, often founder on the problem of explaining the 
possibility of error. Suppose, for example, that we attempt to explain 
representation along these lines: fact (s,σ) represents fact (s′,τ) iff (s′,τ) 
is a causally necessary condition of (s,σ). Such an account, of course, leaves 
no room for the possibility of misrepresentation, if ‘necessary condition’ is 
interpreted strictly. Whenever an actual (s,σ) represents a fact (s′,τ), the 
state (s′,τ) will be actual and (s,σ) will not be in any way a 
misrepresentation of the world. 

A common strategy for solving this problem is to distinguish between two 
types of situation: type 1 and type 2.¹ We can then identify the content of a 
representation with that fact which is causally necessitated by the form of the 
representation in type 1 situations. A representation in a type 2 situation can 
then misrepresent the world, since its content is determined with reference to 
a counterfactual situation: what would be causally necessitated by the form of 
the representation were it to be located in a situation of type 1 instead of type 
2? 

The distinction between type 1 and type 2 situations is typically made in 
terms of the historical antecedents of the representational form. For example, in 
Dretske’s account of representation (Dretske (1981)), type 1 situations are those 
situations that occur during the training period in which the meanings of the 
representational forms are impressed upon the individual subject. Similarly, in 
Millikan’s account (Millikan (1984)), type 1 situations are those situations that 
actually occurred in the evolutionary history of the representational system in 
question. 


¹ See (Fodor, 1990, p. 60). 

The strategy of appealing to historical antecedents leads to a number of 
serious difficulties, as I will argue in section 9.2. Firstly, the historical strat- 
egy tends to attribute contents that are far too weak, since misrepresentations 
do in fact occur in type 1 situations, and these are misdescribed as veridical 
whenever the historical strategy is followed. Secondly, the historical strategy 
makes content far too sensitive to irrelevant accidents of history. Finally, this 
strategy would force us to make facts about the remote past relevant to the best 
theoretical account of the present. 

In section 9.3, I develop two accounts of the nature of information and of 
the possibility of error that avoid the historical strategy, the type 1/type 2 
distinction, and the concomitant difficulties. The first account relies exclusively 
on probabilistic connections between states, interprets information in terms of 
probabilistic necessitation, and leaves room for the possibility of error by failing 
to make the erroneous inference (made, surprisingly enough, by both Dretske 
and Fodor) from an event’s having probability zero (or infinitely close to zero) 
to that event’s being absolutely impossible. The second account uses the idea 
of conditional functions to explain error: a representation is erroneous when 
it has the function (conditional on q) of carrying the information that p, and 
condition q fails to hold. 


9.2 The Historical (Retrospective) Strategy 


The simplest information-based theory of representation would go something 
like this: a representation type σ represents the actuality of some state of 
affairs τ just in case it is causally impossible that σ be actual without τ’s 
being actual. This means that either τ is a causally necessary condition for σ 
or σ is a causally sufficient condition for τ. Unfortunately, this simple theory 
attributes content to representation types that is far too weak, so weak that 
error is impossible. On this account, if σ is actual, and σ represents τ, then τ 
must also be actual. 

One way to narrow the content of a representation-type and thereby to explain 
the possibility of error is to appeal not only to the present causal properties 
of the representation-type but also to facts about the actual history of the type 
(even its remote history). These facts may be facts about the previous history 
of the individual symbol user, as in Dretske’s theory (Dretske (1981)), or 
about the history of the representational practice to which the type belongs, 
as in Millikan’s account (Millikan (1984)). A simple Dretske-like theory might 
take the following form: representation-type σ represents τ for subject A iff for 
every situation s belonging to the training period (during which A learned the 
meaning of σ), τ was causally necessary in the circumstances (in s) for σ. We 
could say that a state of affairs τ is causally necessary in the circumstances of 
s for σ iff there is a state of affairs υ such that both σ and υ are actual in s and 
τ is causally necessary for the joint occurrence of σ and υ. Misrepresentation is 
possible for any representation occurring outside of the training period, because 
an event-token of σ occurring in a situation s outside the training period might 
represent τ even though τ is not necessary for σ in the circumstances of s. 

This sort of account has the paradoxical result that the longer and more 
varied the training period, the weaker the content of the representation-type. If, 
for example, we extended the scope of the relevant history to include the whole 
history of the representational practice (in the case of natural representations, 
this would mean the entire evolutionary history of the species), the resulting 
content would be so weak as to render error virtually impossible. 

An alternative but very simple informational account would stipulate that σ 
represents τ just in case the occurrence of σ increases the objective probability 
of τ. Let’s say that σ probabilifies τ just in case the objective conditional 
probability of τ on σ is greater than that of τ on the negation of σ. We could 
say that σ represents τ just in case σ probabilifies τ. This account has a defect 
that is exactly opposite to the defect we encountered in the simple 
causal-necessitation model. Instead of making error impossible, this account 
makes error absolutely ubiquitous. Every representation represents innumerably 
many possible states of affairs, all but a vanishingly small proportion of which 
are nonexistent. 
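
The probabilification test is easy to state computationally. What follows is a minimal sketch, assuming a finite joint distribution over three binary facts (predator present, wind blowing, rustling sound heard); all of the numbers, and the helper names `conditional` and `probabilifies`, are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical base rates and likelihoods (illustrative assumptions only).
P_PREDATOR, P_WIND = 0.01, 0.3


def p_rustle(predator: bool, wind: bool) -> float:
    """Chance of a rustling sound given whether a predator or wind is present."""
    return 0.9 if (predator or wind) else 0.05


# Joint distribution over worlds (predator, wind, rustle).
joint = {}
for predator, wind, rustle in product([True, False], repeat=3):
    p = P_PREDATOR if predator else 1 - P_PREDATOR
    p *= P_WIND if wind else 1 - P_WIND
    pr = p_rustle(predator, wind)
    p *= pr if rustle else 1 - pr
    joint[(predator, wind, rustle)] = p


def conditional(joint, target, given) -> float:
    """P(target | given), with target and given as predicates on worlds."""
    mass_given = sum(p for w, p in joint.items() if given(w))
    mass_both = sum(p for w, p in joint.items() if given(w) and target(w))
    return mass_both / mass_given


def probabilifies(joint, sigma, tau) -> bool:
    """sigma probabilifies tau iff P(tau | sigma) > P(tau | not sigma)."""
    return conditional(joint, tau, sigma) > conditional(
        joint, tau, lambda w: not sigma(w))


def is_predator(w): return w[0]
def is_wind(w): return w[1]
def is_rustle(w): return w[2]


print(probabilifies(joint, is_rustle, is_predator))
print(probabilifies(joint, is_rustle, is_wind))
print(conditional(joint, is_predator, is_rustle))
```

On these assumed numbers the rustling sound probabilifies both the presence of a predator and the presence of wind, even though a predator remains improbable given the sound. On the simple account the sound would therefore "represent" a predator and be in error almost every time, which illustrates the ubiquity problem.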

Millikan starts with this probabilizing model and solves the ubiquity of error 
problem by adopting a version of the historical strategy. On a simple 
Millikan-like account, we could stipulate that σ represents τ iff σ probabilifies 
τ, and the fact that σ probabilifies τ has in reality contributed causally to the 
perpetuation of some reproductive family to which σ belongs. The longer and 
more varied 
is the relevant evolutionary history, the narrower are the contents ascribed to 
the representation and the more frequent are the errors and misrepresentations. 
Indeed, as Millikan recognizes, there will be many representational forms that 
will be erroneous on nearly every occasion (Millikan, 1984, p. 34). For example, 
suppose that some pattern of auditory stimulation increases the probability of 
the presence of a predator and that this pattern has triggered a flight response 
in the past, contributing thereby to the perpetuation of the species. Then, 
on Millikan’s account, the auditory pattern represents “Predator near!” even 
though on nearly all occasions, the pattern is caused by the wind’s rustling of 
leaves. In fact, Millikan’s account cannot provide a basis for ascribing proba- 
bilistic content. For example, we could not, on her account, distinguish between 
signals that mean “There’s a slight chance of a predator near” from those that 
mean “More likely than not there’s a predator near” or “Without a doubt a 
predator is near.” All of these signals would simply represent “Predator near” 
without qualification. 

There are a number of other difficulties that could be raised concerning the 
details of Dretske’s theory or of Millikan’s, but here I would like to concentrate 
on some problems that are endemic to the historical strategy itself. Firstly, 
reliance on the historical strategy causes deviant cases in the past to influence 
the content of representations. For instance, it is quite common for training 
periods to include some cases in which the representation is wrongly but plau- 
sibly applied. I could teach a child the true meaning of ‘bird’ by means of 
cleverly constructed mechanical models, even though every attribution of the 
term in the training period was false. Similarly, in the evolutionary history of 
any representational system, there will be events in which misrepresentations 
accidentally contributed to the survival of the system. 

Secondly, the historical strategy makes content too sensitive to accidental 
features of history. For the sake of illustration, consider the following version of 
the Twin Earth thought-experiment. Suppose that on Twin Earth, both H₂O 
and XYZ occur in equal abundance, and in close proximity to one another: here 
an H₂O lake, there an XYZ river, and so on. Suppose further that, simply as 
a matter of pure coincidence, the inhabitants of Twin Earth have encountered 
only H₂O and have applied to it the term ‘water’. Applying the historical 
strategy means interpreting this symbol as designating only H₂O, despite the 
fact that Twin Earthers are, in the future, just as likely to encounter XYZ as 
H₂O and are completely unable to discern any difference between the two. 

Thirdly, the historical strategy makes facts about the remote past directly 
relevant to the ascription of content to present-day representations. Content 
ascription should enable us to understand and explain the behavior of rational 
agents; information about the remote past of such agents cannot be of any 
immediate significance for this task, unless we are to believe in something like 
action at a temporal distance. 


9.3 Two New Strategies 


9.3.1 Fallible Information 


Information is somehow tied to objective probabilistic relevance. In explicating 
this tie, we seem to be faced with a dilemma. If we insist that whenever a 
fact σ carries the information τ, the objective conditional probability of τ on σ 
be one, then we make σ sufficient for τ, thereby eliminating any possibility of 
error. Alternatively, if we require only that the conditional probability of τ on 
σ be very high (though not necessarily equal to one), or that the probability of 
τ on σ be greater than that of τ on ¬σ, then we run afoul of a very important 
principle of information, what Dretske calls the Xerox principle (Dretske, 
1981, pp. 57-58). Dretske’s Xerox principle is simply the requirement that 
the carriage of information is transitive: if σ carries τ, and τ carries υ, then σ 
carries υ. Obviously, if we set some finite distance ε from one as the threshold 
on conditional probability for the carriage of information, then this carriage will 
not be transitive. 
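
A small numerical case makes the failure of transitivity concrete. The sketch below assumes, purely for illustration, that υ depends on σ only by way of τ (a simple chain), that the threshold is 0.95, and that υ never occurs without τ; the function name `chained_probability` and all of the figures are hypothetical.

```python
def chained_probability(p_tau_given_sigma: float,
                        p_ups_given_tau: float,
                        p_ups_given_not_tau: float) -> float:
    """P(upsilon | sigma), assuming upsilon depends on sigma only through tau."""
    return (p_ups_given_tau * p_tau_given_sigma
            + p_ups_given_not_tau * (1 - p_tau_given_sigma))


THRESHOLD = 0.95   # assumed finite threshold, i.e. epsilon = 0.05
p_link = 0.95      # each individual link just meets the threshold

p_chain = chained_probability(p_link, p_link, 0.0)
print(p_link >= THRESHOLD, p_chain >= THRESHOLD)  # prints: True False
```

Each link meets the threshold, but the chained probability is roughly 0.9025, so under any fixed finite threshold σ can carry τ and τ carry υ without σ carrying υ.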

The dilemma stands only if one assumes — as Dretske (Dretske, 1981, p. 
245) explicitly does — that it is impossible that a state have probability one 
and fail to be actual. This assumption is false for standard interpretations of the 
probability calculus, in which events of measure zero are quite possible. I agree 
with Dretske that this assumption is a useful one. However, one can accept this 
assumption and still avoid the dilemma, by using a non-standard probability 
theory, one permitting hyperreal, i.e., infinitesimal, quantities. 

An abstract or generic informational link involves three entities: a situation- 
type that characterizes the carrier of the information, a binary relation on tokens 
that constitutes the “direction” of information flow, and a situation-type that 
characterizes the target of the information link (that with which the link is 
concerned). 


Definition 9.1 (Generic Information Link) 


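$$
\begin{aligned}
(\phi \Rightarrow_R \psi) \;=_{def}\;& \forall x\,\bigl((Ax \;\&\; (x \models \phi)) \mathrel{\Box\!\!\rightarrow} \exists y\,(Ay \;\&\; (y \models \psi) \;\&\; Rxy)\bigr) \;\&\; \\
& \forall y\,\bigl((Ay \;\&\; (y \models \psi)) \mathrel{\Diamond\!\!\rightarrow} \exists x\,(Ax \;\&\; (x \models \phi) \;\&\; Rxy)\bigr)
\end{aligned}
$$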


In this definition, I am assuming that R is a relation (like the natural 
connection relation >> or its converse) that is necessarily uniquely valued 
(functional on its domain): if Rss′ and Rss″, then necessarily, s′ = s″. If this 
is not the case, then we must add a clause stating that y is the only token 
R-related to x to the consequent of the two conditionals. The definition 
guarantees that the probability of a situation of type ψ that is R-related to a 
given situation of type φ is infinitely close to 1, and, in addition, that the 
probability of the existence of a situation of type φ, given the existence of one 
of type ψ, is finite. This second clause is needed in order to support the 
validity of the Xerox principle, in the following form: 

Xerox Principle 


$$(\phi \Rightarrow_R \psi),\ (\psi \Rightarrow_{R'} \chi) \;\models\; (\phi \Rightarrow_{R \circ R'} \chi)$$


We can also describe the flow of information from one token to another. The 
definition of information flow involves five parameters: two tokens, two types, 
and the linking relation. 


Definition 9.2 (Token-to-Token Information Link) 


$$(s_1 : \phi) \Rightarrow_R (s_2 : \psi) \;=_{def}\; (\phi \Rightarrow_R \psi) \;\&\; R s_1 s_2 \;\&\; (s_1 \models \phi)$$


Misinformation is quite possible, since we know only that the conditional 
probability of s₂’s being ψ is infinitely close to 1. 

The principle of probabilistic locality entails that any information link be- 
tween mereologically disjoint tokens is causally mediated: either the carrier is 
part of a cause of the target, or vice versa, or there is a common cause of both. 

This account has the counterintuitive result that misinformation or natural 
error can be expected to occur with only an infinitesimal frequency. Two things 
can be said in response. First, since information is ubiquitous, the fact that 
the limiting relative frequency of misinformation is infinitesimal does not entail 
that the absolute frequency of error is low. Moreover, when misinformation 
is detected, this fact is especially vivid and salient, while the background of 
accurate information is taken for granted and largely unnoticed. Second, the 
usefulness of my account does not depend on taking the requirement of an in- 
finitesimal relative frequency of error literally. Presumably, misinformation is 
exceptional, occurring with a very low relative frequency. At some point, very 
low finite probabilities are treated, for all practical purposes, as though they 
were infinitesimal. There are fairly obvious computational advantages to work- 
ing with qualitative differences, represented formally as infinite ratios, instead 
of working exclusively with quantitative differences. What I am offering is a 
formal model of how we reason commonsensically about information. If the 
account faithfully reproduces the crucial features of our commonsense practice, 
then the question of its literal truth is of little or no importance. In actual 
practice, we apply descriptions like ‘misinformation’ or ‘error’ to cases of which 
the descriptions are not literally true, as, for example, we apply descriptions 
like ‘flat’ to surfaces that are not literally flat but are close enough to flatness 
for practical purposes. 

However, if this objection is taken to be decisive against the proposal I have 
made, or at least against its adequacy as an account of all misrepresentation, I 
have an alternative account, one in terms of conditional functions. I take that 
account up in the next subsection. 


9.3.2 Conditional Functions 


Let us return for the moment to a simple necessitation model of information, 
like that of Dretske. 


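\[
(\phi \Rightarrow_{R} \psi) =_{\mathrm{def}} \Box\,\forall x\,\bigl(Ax \;\&\; (x \models \phi) \rightarrow \exists y\,(Ay \;\&\; (y \models \psi) \;\&\; xRy)\bigr)
\]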


Many animals have what are known as “flight mechanisms.” These flight 
mechanisms are perceptual sensitivities to the environment that trigger the re- 
action of fleeing. They are adaptive because they often enable the animal to 
escape predatory animals. However, the perceptual sensitivity does not carry 
the information that a predator is present, either in the strong, Dretskian sense, 
or in the weaker sense developed by means of hyperfinite probabilities in the 
last section. The probability that a predator is actually present may be quite 
low, even less than 1%. Nonetheless, we would like to say that the perceptual 
state in some sense represents the possible presence of a predator. 

One solution to this problem is to build probability into the content of the 
representation. The content of the perception is something like: there is a 1% 
probability that a predator is near, and located roughly to the right. One difficulty 
with this solution is that it seems to attribute sophisticated probabilistic concepts 
to quite primitive animals. However, this is not quite right, since there is no 
reason to suppose that the content of the representation is articulated in such 
a way that there is a component corresponding to the 1% probability concept. 

Nonetheless, it would be preferable to find a model of representation that 
did not require such enriching of the content. A better solution is to make use 
of a concept of conditional function. On this model, a representation of the 
content p is not simply a state with the function of carrying the information 
that p, but rather a state with the conditional function of carrying the informa- 
tion that p in circumstances C. In the case of the flight mechanism, we could 
say that the associated perceptual states have the function of carrying information 
about the approximate location of a predator in circumstances in which a 
predator is actually present and has actually made some significant noise of the 
appropriate kind. It is the animal’s perceptual state plus the teleologically rele- 
vant circumstances that carry the information that a predator has a particular 
location. When the perceptual state occurs but the relevant circumstances are 
not actualized, we can say that the state constitutes a misrepresentation. 

This notion of conditional function is closely related to the idea of conditional 
constraints developed by Barwise and Perry.² We can say that a perceptual state 
φ of a token s has the conditional function (conditional on circumstances χ) of 
carrying the information (relative to R) that a token of type ψ is actual if and 
only if the fact that the conjunction φ & χ carries the information (relative to 
R) that ψ is realized is causally relevant to s’s being of type φ. 


Definition 9.3 (Conditional Representation-Function) 


\[
T(s, \phi, R, \psi, \chi) =_{\mathrm{def}}
\bigl(s' : ((\phi \;\&\; \chi) \Rightarrow_{R} \psi)\bigr) \rhd (s : \phi)
\]

Even if the model of conditional functions is needed to explain the possi- 
bility of many forms of error, it is still useful to employ the hyperfinite model 
of information developed in the preceding section. For one thing, if the world 
is radically indeterministic, there may be no information that fits the strict- 
necessitation model of Dretske. Moreover, if we employ the hyperfinite model 
of information and the conditional-function model of representation, we have 
two independent accounts of the possibility of error. Where error occurs be- 
cause the information is fallible, we can label the case one of malrepresentation. 
Where error occurs because the background condition of the representation is 
not actualized, we can label the case one of misrepresentation. 
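
The two kinds of error can be told apart by a simple case analysis, sketched here in Python. This is an illustrative toy only: the record `Episode`, its two Boolean fields, and the function `classify` are hypothetical names introduced for the example, and the sketch lumps every accurate episode together as "no error" without asking why it is accurate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One occurrence of a representational state (hypothetical toy record)."""
    circumstances_hold: bool  # are the background circumstances C actual?
    content_true: bool        # is the represented content p in fact the case?


def classify(episode: Episode) -> str:
    """Label an episode using the two accounts of error described above."""
    if episode.content_true:
        # No error to explain (assumption: accidental truth is not modeled).
        return "no error"
    if not episode.circumstances_hold:
        # The background condition of the conditional function is not actualized.
        return "misrepresentation"
    # The circumstances hold, so the fallible information link itself failed.
    return "malrepresentation"


# A flight response triggered by a rustling leaf: no predator made the noise.
false_alarm = Episode(circumstances_hold=False, content_true=False)
print(classify(false_alarm))
```

On this sketch the false alarm of a flight mechanism comes out as misrepresentation, while an error that occurs even though the background circumstances are in place comes out as malrepresentation.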


9.4 Information as the Basis of Knowledge 


It is possible to define a notion of robust or knowledge-bearing information. A 
fact (s₁ : φ) is robustly linked to a fact (s₂ : ψ), relative to relation R, just in 
case there is an informational link between the two facts, and this link survives 
the addition of further actual information to its first term. In other words, 
if we extend s₁ to some larger situation s, and we take into account not only 
that s is of type φ, but also that it is of the more specific type φ & χ, then 
there is still an information link (relative to some relation R’) between the facts 
(s : (φ & χ)) and (s₂ : ψ). 
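

Definition 9.4 (Robust Information Link) 


\[
\begin{aligned}
(s_1 : \phi) \mathrel{\circ\!\rightsquigarrow}_{R} (s_2 : \psi) =_{\mathrm{def}}\;
  & (s_1 : \phi) \rightsquigarrow_{R} (s_2 : \psi) \;\&\; \\
  & \forall x\,\forall \chi\,\forall R'\,\bigl((Ax \;\&\; (s_1 \sqsubseteq x) \;\&\; xR's_2 \;\&\; (x \models \chi)) \rightarrow ((\phi \;\&\; \chi) \Rightarrow_{R'} \psi)\bigr)
\end{aligned}
\]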
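
As a rough illustration of Definition 9.4, the following Python sketch checks robustness in a toy finite model. It rests on simplifying assumptions that are not part of the account above: situations are replaced by equiprobable "worlds", types by sets of worlds, the linking relation R is ignored, and "infinitely close to 1" is replaced by a finite threshold. The names `cond_prob`, `carries`, and `robustly_carries` are hypothetical.

```python
from fractions import Fraction

# Finite stand-in for "infinitely close to 1" (an assumption of this sketch).
THRESHOLD = Fraction(19, 20)


def cond_prob(target, given):
    """P(target | given) over equiprobable worlds; types are sets of worlds."""
    if not given:
        raise ValueError("conditioning type has no instances")
    return Fraction(len(target & given), len(given))


def carries(phi, psi):
    """Simple link: the conditional probability of psi on phi meets the threshold."""
    return cond_prob(psi, phi) >= THRESHOLD


def robustly_carries(phi, psi, actual_world, extra_types):
    """Robust link: the simple link holds in the actual world and survives
    conjoining phi with every further type chi realized in that world."""
    if actual_world not in phi or not carries(phi, psi):
        return False
    return all(
        carries(phi & chi, psi)
        for chi in extra_types
        if actual_world in chi
    )


phi = set(range(40))   # carrier type: worlds 0..39
psi = set(range(39))   # target type: fails only in world 39
chi = {38, 39}         # further information that undermines the link

print(carries(phi, psi))                       # P(psi | phi) = 39/40
print(robustly_carries(phi, psi, 38, [chi]))   # chi is actual in world 38
print(robustly_carries(phi, psi, 0, [chi]))    # chi is not actual in world 0
```

In world 38 the target type is in fact realized, yet the link is not robust, because adding the actual information χ drops the conditional probability to 1/2. This is the kind of case in which a true representation falls short of knowledge.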


²Barwise and Perry (1983, pp. 112-114, 270-272) and Barwise (1989, pp. 149-151). 


Robust or knowledge-bearing information carriage is transitive, that is, it 
satisfies the Xerox principle. In fact, this is so whether we define simple infor- 
mation carriage in terms of conditional probabilities infinitely close to 1 or in 
terms of finite conditional probabilities, despite the fact that, in the second of 
these definitions, simple information carriage is not itself transitive. 
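
The failure of transitivity for the finite-probability definition can be seen in a small counterexample, sketched below in Python. The setup is an assumption made for illustration: a hundred equiprobable worlds, types as sets of worlds, a cutoff of 19/20 standing in for "high finite probability", and no linking relation. The helper names are hypothetical.

```python
from fractions import Fraction

# Finite cutoff for simple information carriage (an assumption of this sketch).
THRESHOLD = Fraction(19, 20)


def cond_prob(target, given):
    """P(target | given) over equiprobable worlds; types are sets of worlds."""
    return Fraction(len(target & given), len(given))


def carries(phi, psi):
    """phi carries psi when the conditional probability of psi on phi is high."""
    return cond_prob(psi, phi) >= THRESHOLD


worlds = set(range(100))
psi = worlds                  # psi holds in every world
chi = set(range(5, 100))      # chi fails only in worlds 0..4
phi = set(range(5))           # phi holds only in worlds 0..4

print(carries(phi, psi))   # True:  P(psi | phi) = 1
print(carries(psi, chi))   # True:  P(chi | psi) = 95/100
print(carries(phi, chi))   # False: P(chi | phi) = 0
```

Every φ-world is one of the few ψ-worlds in which χ fails, so the two links hold while their composition does not.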

An organism can be said to be designed or adapted for the purpose of acquir- 
ing robust information and not just information simpliciter if that organism’s 
functions include routines for the detection of error and anomaly, in other words, 
if the organism is habitually seeking corroboration or correction of its current 
information state. 


10 


A Look Back, and Ahead 


10.1 The Causal Relation 


In chapter 3, I argued that we have evidence from natural language that the 
relata of causation are situations, parts of the world. These situations call for a 
non-classical, three- or four-valued semantics, which I develop in some detail in 
appendix A. 

In chapters 4, 5 and 6, I developed, respectively, a deterministic, indetermin- 
istic, and probabilistic model of causation, using the formal language defined in 
appendix A. I demonstrated that these models satisfied a number of important 
desiderata for theories of causation, and I tested them against a range of exam- 
ples. I also demonstrated (in chapter 7) that these models make higher-order 
causation possible. 

Chapters 8 and 9 represented applications of the theory to outstanding prob- 
lems. I argued in chapter 8 that all wholly contingent situations have causes, 
and that this points to the existence of a necessary first cause, reviving the 
ancient cosmological argument. In chapter 9, I used my causal language in de- 
veloping an account of natural information and misinformation that can be used 
in explaining the existence of representational states. 

In appendix B, I will turn to the problem of giving an adequate and useful 
theory of defeasible or non-monotonic inference. I will show that a 
system of defeasible inference that incorporates my account of causation is able 
to give correct and principled solutions to familiar problem cases, such as the 
Yale Shooting Problem. 


10.2 Against Determinism 


One theme that has recurred in this volume is that of the unacceptability of de- 
terminism. Determinism is the conjunction of two theses: (1) the necessitarian 
conception of causation and (2) the universality of the causation of temporally 
bound situation-tokens. I have offered a number of reasons for rejecting the 
necessitarian conception. In fact, I have even argued that causes can never 
necessitate their effects. I hold this for several reasons: 


1. The causal priority relation is one of asymmetric necessitation: causally 
posterior tokens necessitate the existence of causally prior tokens. If causes 
necessitated their effects, the asymmetry would be violated. Moreover, 
the mutual necessitation of causes and effects would make their separate 
existence problematic. 


2. The necessitarian model of causation leads to an inflation of causal and 
explanatory connections, as I argued in chapters 4 and 5. 


3. There seem to be coherent thought-experiments involving indeterministic 
causation, in which the cause does not necessitate its effect, for example, 
Mackie’s indeterministic vending machine M (section 5.6.3). The necessi- 
tation model does not fit well with our commonsense view of causation. 


4. Determinism undermines the veridicality of all deliberation, since it contradicts 
the existence of genuinely possible alternative futures (section 
16.8). It sets up a false opposition between causation and agency. 


5. Coherent indeterministic and probabilistic models of causation are avail- 
able (chapters 5 and 6). 


The first reason is based on the principle of asymmetric necessitation of 
causal antecedents. This principle in turn receives independent support from a 
number of sources. 


1. The thesis corresponds with our commonsense notion that the past is fixed 
and the future is open. 


2. The thesis enables us to avoid introducing causal priority as an undefined 
primitive, leading to a more economical ontology. 


3. The thesis corresponds to natural conditions for the transworld identity 
of situation-tokens. 


4. The thesis simplifies the definition of screening off and seems to accord 
with our intuitions about what information is needed in justifying causal 
inferences (appendix B). 


10.3 Spacetime as Constrained by Causation, 
Not Vice Versa 


My definitions of causation and of causal priority have not included any spatial 
or temporal relations. This was a conscious decision, since I wanted to be able 
to use causation in the analysis of space and time. I sketched the beginning of 
such an account in section 4.10.2. 

I will argue in chapter 18 of part II that a causal theory of spacetime sheds 
new light on the paradoxes of quantum reality. In particular, I will argue that 
the non-locality of quantum influences should come as no surprise, since spa- 
tiotemporal locality is a construction designed to fit (as closely as possible and 
as simply as possible) the network of macrophysical interactions. 

In addition, in chapter 18 of part II, I will use the causal theory of this 
volume in an explication of our concept of enduring substances, such as people, 
organisms and artifacts. This explication depends crucially on the priority of 
causation over space and time, since it would be problematic to take space and 
time as given independently of the existence of enduring objects. 

The main argument of chapter 8 reinforces the conclusion that causation is 
independent of spatiotemporal relations. In that chapter, I argued that we have 
good reason to postulate a necessary first cause of all contingent situations. 
This first cause is presumably non-spatial and timeless (since spatiotemporal 
location would seem to introduce an element of contingency), yet it has genuine 
causal efficacy. 

Another bonus of giving a non-spatiotemporal account of causation is that 
it enables me to build causal theories of our knowledge of extra-spatial objects, 
such as the world of logic, mathematics, and modality. This enterprise will be 
a major part of my project in part II. 


Part II 


Applications to 
Metaphysics, Epistemology, 
and Ethics 


11 


An Overview 


11.1 Teleology as Higher-Order Causation 


The notions of natural teleology and biological function play an increasingly 
significant role in contemporary philosophy, especially in recent theories of con- 
tent and of knowledge. The twentieth century has been characterized by an 
intensifying of efforts at clarifying the logic, semantics, and metaphysics of tele- 
ology, rectifying the unfortunate neglect of the topic in modern philosophy since 
Leibniz. One of the most influential and attractive accounts was that of Charles 
Taylor, in his 1964 The Explanation of Behavior (Taylor (1964)). Taylor’s in- 
fluence can be seen in most contemporary accounts, including those of Larry 
Wright, Andrew Woodfield, and Ruth Millikan. 

According to both Taylor and Wright (1976), a state B occurs for the sake 
of state G just in case (1) B tends to bring it about that G, and (2) B occurs 
because it tends to bring it about that G. This is clearly an instance of higher- 
order causation: the causal connection between B and G figures in the causation 
of instances of B. The formal theory of causation that I have developed in this 
volume was designed specifically to explicate this sort of possibility. 

Ruth Millikan has argued that reliance on this sort of higher-order causation 
makes sense only if we make explicit reference to the past. She argues that clause 
(2) must be replaced by one that reads: 


(2’) the present token of B occurs because past instances of B tended to bring 
it about that G. 


Wright explicitly rejects this amendment, to which Millikan responds: 


Wright says that the formulation “because X does Z” does not re- 
duce to “because things like X have done Z in the past.” Rather, 
we are asked to accept that X might be there now because it is true 
that now X does or X’s do result in Z. How the truth of a propo- 
sition about the present case can “cause” something else to be the 
case at present is not explained. (Millikan, 1989b, page 299, note 7) 

Millikan overlooks two facts. First, the fact that X’s tend to bring about 
Z is not a fact about the present case: it is a timeless, eternal fact about the 
modal and stochastic structure of the world. Second, Millikan fails to take into 
account the fact that such eternal facts can enter into causal explanations of 
present conditions, as I argued in detail in chapter 7. 


11.2 Teleosemantics 


One of the central problems of philosophy has been that of accounting for the 
possibility of the existence of states with content, i.e., the possibility of repre- 
sentational states. I take a representation to be a state with the teleological 
function of carrying a piece of information. The piece of information is the con- 
tent of the representation. In chapter 9, I gave two accounts of the possibility 
of error: misrepresentation and malrepresentation. 

In the case of misrepresentation, we are dealing with a state that has a 
conditional function: it has the function of carrying a piece of information 
in a specific set of circumstances. In cases in which these circumstances are 
not present, the state still has the same representational content, even though 
it does not actually carry the information it is supposed to. In the case of 
malrepresentation, the representational state does actually carry the appropriate 
information, but the information itself fails to be veridical. This is possible if we 
use a model of information that employs hyperfinite conditional probabilities: a 
state φ carries the information ψ just in case the conditional probability of ψ on 
φ is infinitely close to 1. This model satisfies Dretske’s Xerox principle, that is, 
information carriage is transitive. At the same time, it opens up the possibility 
of misinformation. 

In this part, I will develop this model of representation into a novel account 
of mental states. In chapter 14, I sketch an account of a variety of mental states, 
such as belief, desire, intention, and so on. In chapter 16, I use my account of 
mental representation to explain some of the puzzling features of sensory qualia. 
In particular, I will attempt to explain why qualia are irreducible to physical 
properties. 

I am deeply committed to the view that thought (and mental representa- 
tion) is not dependent on public language. Language is impossible apart from 
the existence of speakers capable of mental representation, but mental repre- 
sentation as such is not dependent on the presence of language. It is certainly 
true, however, that the existence of language greatly enhances our capacity for 
complex and subtle representations. Moreover, I do not favor an account of lin- 
guistic meaning that reduces public meaning to speakers’ meaning, as proposed 
by Grice and Searle. Linguistic meaning consists in certain proper teleofunctions 
of phonemes and syntactic structure: most basically, in the function of 
sentences to carry information about described situations when used in asser- 
toric reports. In a sense, the words use us to reproduce themselves successfully 
by fulfilling certain adaptive functions. Complex Gricean communicative inten- 
tions are not essential to the use of language. Beyond this bare sketch, I will 
say next to nothing about language in this book. 


11.3 The Link between Teleosemantics 
and Epistemology 


By combining my definition of teleofunction with my account of information, 
I can define the semantic content of beliefs and perceptions. A perceptual or 
doxastic state of type φ represents that p just in case type φ has the teleofunction 
of robustly carrying the information that p. In other words, a belief has content 
p just in case it is of a type whose proper function is that of robustly carrying 
the information that p. Since to be a belief whose content information is carried 
robustly is to be a case of knowledge, we can say that a belief that p is a state 
whose function is fulfilled by being a state of knowing that p. Knowledge and 
belief are thus interdefinable: knowing that p is being in a state of believing 
that p whose function is fulfilled, and believing that p is being in a state whose 
function is fulfilled by knowing that p. This circularity is not vicious, since each 
can be non-circularly defined in terms of the more basic notions of function and 
robust information. 

One consequence of this account of content is the inseparability of semantics 
and epistemology. If knowledge of p is impossible, so is belief that p. Conversely, 
if belief that p is possible, then the normal case of believing that p will be a case 
of knowing that p. A certain kind of global skepticism is therefore incoherent. 
One cannot suppose that we can grasp some domain of propositions without 
supposing that we have the natural capacity for knowledge of that domain. 


11.4 Causal/Teleological Accounts 
of Knowledge 


Since on my account, timeless and non-spatial realities, like the structure of 
modality, can enter into causal relations with spatiotemporal processes, I can 
give a causal account of our knowledge of logic, mathematics, and causal ne- 
cessity that closely parallels causal theories of our knowledge (via perception) 
of spatiotemporal objects. In part IH, I develop a causal theory of logical and 
mathematical knowledge in chapter 15, and of scientific knowledge in chapter 
17. 

I also employ a teleological element in my account of knowledge. When a 
representational state fulfills its function, it constitutes knowledge, not merely 
truth. A case of true opinion is a case of partial teleofunctional failure. Thus, 
we should not think of knowledge as truth plus belief plus some third factor. 
Instead, we should think of true opinion as knowledge minus something (namely, 
the appropriate kind of reliability). I call the resulting theory one of teleological 
reliabilism, since it incorporates the advantages of reliabilism, while avoiding 
the standard objections through the inclusion of a teleofunctional element. 


11.5 Mental Causation and Qualia 


The most fashionable view in the philosophy of mind today is that of non-reductive 
materialism. I will defend a view that is doubly unfashionable: a 
non-materialist reductionism. I agree with reductionists that we must, in the 
philosophy of mind, seek an illuminating account of the nature of mental ac- 
tion, intentionality, and qualia, but I also agree with most anti-reductionists in 
thinking that the resources available to the materialist are inadequate to this 
task. The solution is to step outside the materialist box. By incorporating 
a theory of concrete causation that involves eternal facts, such as facts about 
modality and objective chance, in the causal order, I propose a novel solution to 
the problem of accounting for mental causation and the gap between physical 
and phenomenological properties. 


11.6 Teleological Accounts of Ethics 


Teleological realism makes possible a very robust form of ethical and moral 
realism. It is not necessary to think of the good as some sort of projection of 
idealized desires or preferences. Instead, the good life for a human being can be 
defined as one in which all of the primary functions of human life are fulfilled 
(that is, the functions that are not corrective or ameliorative in nature, like 
functions for healing or resisting infection). 

Moral goodness consists, as Aristotle recognized, in fulfilling certain teleo- 
functions associated with character, that is, with our ability to make appropri- 
ate choices and carry them out successfully. Moral virtue both contributes to 
happiness and is itself an integral component of happiness, since many moral 
functions are primary functions we possess as human beings. 

There are two reasons for believing that values and moral norms are objec- 
tively real. First, we have a natural tendency to believe ethical propositions, 
and non-cognitive accounts of the meanings of these propositions cannot account 
for the fact that we engage in controversy and argumentation over their truth 
values. Second, objectivity provides a simple explanation of the widespread 
cross-cultural agreement we observe on questions of what is good and praise- 
worthy. 

A standard anti-realist rebuttal to the argument from agreement is to pro- 
pose that the agreement we observe can be explained by means of natural se- 
lection: that cultures following radically different norms are unable to survive. 
However, this response undercuts ethical realism only if such an appeal to nat- 
ural selection is itself compatible with the denial of ethical objectivity. I argue 
that, to the contrary, appeals to natural selection of this kind entail the existence 
of moral teleofunctions that adequately ground the objectivity of morality. 

The teleofunctions of any organism, such as a human, are to a very high 
degree mutually supportive and inter-dependent. The operation of certain func- 
tions, for example, those involved in repair and healing, presupposes the failure 
of other functions. I call these the ‘secondary functions’. Primary functions have 
no such presupposition. The simultaneous fulfillment of all of an organism's primary 
functions is the state of eudaemonia. In the case of rational animals, such 
as humans, human eudaemonia is the ultimate end of all action. 

Subjective states, such as pleasure, pain, satisfaction, dissatisfaction, and 
sense of malaise or of well-being, are all representational in character. Pleasure 
and the sense of well-being have as their natural function the carrying of the 
robust information that eudaemonia has been at least partially achieved. Pain, 
dissatisfaction, and malaise all have the function of carrying the information 
that some function has failed. Our dispositions to feel pleasure and pain are 
fallible but reliable indicators of the underlying, objective condition. 

Moral virtue is the disposition to make decisions that promote eudaemonia 
in the normal way and under normal circumstances. The exercise of virtue is 
valuable both as a means (as a reliable way of achieving eudaemonia) and as an 
end in itself (as a natural constituent of eudaemonia). We cannot fulfill all of 
our proper functions without fulfilling the natural functions of the will, which 
includes the development and exercise of virtue. 

Moral truths have the power to provide both reasons and motivation, since 
the human capacities for reasoning and desiring have the natural disposition to 
respond to moral truth. It is partly constitutive of being a good reasoner that 
one accept moral claims as providing good reasons to act. Our ultimate aim is 
not up to us, nor merely the product of accidental contingencies. It is our being 
objectively ordered to the ultimate end of human eudaemonia that makes us 
human and thereby constitutes us as capable of desiring and wanting. 


11.7 Enduring Substances as Logical 
Constructions 


Enduring substances are logical constructions whose being is constituted by 
a causal chain of situation-tokens. A chain of situation-tokens constitutes a 
substance history just in case there is some type ¢ realized by each member 
of the chain, and each succeeding member’s being ¢ is causally explained by 
its predecessor’s being ¢. The properties of a substance are always indexed to 
some member of its history. In normal circumstances, this can be adequately 
represented by indexing the corresponding proposition to some point in time. 
However, when time travel is involved, it is necessary to index each substance- 
property attribution to a time and a place. Consequently, it is possible for a 
substance to have incompatible properties at the same time, so long as it has 
the properties at different places. 

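The condition for a substance history can be written as a small check, sketched here in Python. It is a toy under stated assumptions: situation-tokens are plain labels, and the two predicates `realizes` and `explains` (a token's realizing a type, and one token's being of a type causally explaining the next token's being of that type) are supplied from outside as hypothetical lookup tables rather than derived from the causal theory. The function name `find_substance_type` is likewise hypothetical.

```python
def find_substance_type(chain, types, realizes, explains):
    """Return a type that makes `chain` a substance history, or None.

    A type phi qualifies when every member of the chain realizes phi and each
    member's being phi is causally explained by its predecessor's being phi.
    """
    if not chain:
        return None
    for phi in types:
        every_member_is_phi = all(realizes(token, phi) for token in chain)
        each_step_explained = all(
            explains(earlier, later, phi)
            for earlier, later in zip(chain, chain[1:])
        )
        if every_member_is_phi and each_step_explained:
            return phi
    return None


# Toy data (hypothetical): three situation-tokens in the history of a tree.
realized = {"t1": {"green", "oak"}, "t2": {"oak"}, "t3": {"oak"}}
explained = {("t1", "t2", "oak"), ("t2", "t3", "oak")}


def realizes(token, phi):
    return phi in realized[token]


def explains(earlier, later, phi):
    return (earlier, later, phi) in explained


chain = ["t1", "t2", "t3"]
types = ["green", "oak"]
result = find_substance_type(chain, types, realizes, explains)
print(result)
```

Here "green" fails because the second token does not realize it, while "oak" is realized throughout and passed along causally, so the chain counts as the history of a substance of that type. Reversing the chain yields no qualifying type, since the explanatory links run in one direction only.
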
Since space and time are themselves constructions, based on the underlying 
causal relations, it is quite possible for substances to be only intermittently spa- 
tiotemporal. For example, it is coherent to suppose (with the orthodox Copen- 
hagen interpretation of quantum mechanics) that quantum systems, such as 
electrons and other microphysical objects, take on definite position or momen- 
tum only under special circumstances. When unobservable, these microparticles 
have no spatiotemporal properties, only potentialities for such properties. Consequently, 
the principle of the spatiotemporal locality of causation simply does not apply to them. 


12 


Teleology as Higher-Order 
Causation 


12.1 Three Definitions of Teleology 


In the last forty years, the theory of teleology and biological function has ex- 
perienced a surprising renaissance in analytic philosophy. Charles Taylor was 
a pioneer in this field through his 1964 The Explanation of Behavior (Taylor 
(1964)). Taylor’s lead has been followed by Larry Wright, Andrew Woodfield, 
and Ruth Garrett Millikan. 

I will use Wright, Woodfield and Millikan as paradigms of three competing 
accounts of the nature of teleological function. These three accounts are the 
causal, the normative, and the Darwinian, respectively. The Darwinian account 
has two versions, one retrospective (Millikan) and the other prospective (Bigelow 
and Pargetter). 


12.1.1 The Taylor/Wright Account 


In the theory developed by both Taylor and Wright (1976), a state B occurs for 
the sake of state G just in case (1) B tends to bring it about that G, and (2) B 
occurs because it tends to bring it about that G. This is clearly an instance of 
higher-order causation: the causal connection between B and G figures in the 
causation of instances of B. 

As I mentioned in the last chapter, Ruth Millikan has argued that this 
account makes sense only if we replace reference to higher-order causation by 
reference to the past, i.e., to the actual history of the state B. She would replace 
clause (2) by this: 

(2’) the present token of B occurs because past instances of B tended to bring 
it about that G. 

Millikan insists on this substitution because she assumes that a cause must 
precede its effect in time. However, as I have argued in chapter 8, it is quite 
possible for an event in time to be the result, in part, of facts about the modal 
and stochastic structure of the world, and these latter facts cannot be located 
in time. 

To some ears, the notion of causation by timeless facts may sound like an 
oxymoron. If one takes it as an essential part of our concept of causation that 
the relation always holds between items with spatiotemporal location, then I 
will have to make use of some more general notion, such as power or influence. 
One could think of my account as a power or influence theory of teleology, rather 
than as a “causal” one. A state has a teleofunctional character when that state 
is under the power or influence of the appropriate eternal facts, ones involving 
certain causal necessities. 

We can distinguish a number of interesting varieties of teleological connec- 
tion. First of all, we can distinguish between intrinsic and extrinsic purpose. 
For example, the wing of a bird exists for the sake of flying, and this is a case 
of intrinsic purpose. In contrast, seeds serve the purpose of feeding the bird, a 
case of extrinsic purpose. 

Another distinction we can make is that between productive and informa- 
tional functions. The Taylor/Wright definition specifies one important class 
of functions: the productive functions. However, there are also receptive or 
informational functions. For example, the eye has the function of registering 
the existence of certain kinds of objects in the environment. This function is 
a matter not of the eye’s effect on the environment, but of the reverse: of the 
environment’s effect on the eye. In chapter 9, I defined a relation of informa- 
tion (or potential information). We can say that a particular pattern of retinal 
stimulation φ has the intrinsic function in s (relative to v) of carrying the information 
that ψ just in case the pattern φ exists because it carries (in organisms 
of type v) the information ψ. We might say that when a state occurs that has 
the function for an organism to carry potential information of a certain kind, 
then that information has become actual for that organism. 


12.1.2 From Woodfield and Bedau to Aristotle 


Woodfield (1976) argues that the Taylor/Wright account gives a necessary, but 
not a sufficient, condition for teleofunctionality. He urges that we must add a 
normative element requiring that the functional state contribute to the well- 
being of the organism. An example created by Alvin Plantinga (1993) gives 
some support to Woodfield’s contention. We are to imagine a world in which a 
Nazi-like regime institutes a dysgenics program aimed at a hated minority race. 
A harmful mutation is introduced into the minority population that renders the 
bearer nearly blind, and makes attempted seeing painful. The Nazi breeders 
gradually eliminate all of the members of the minority race without the gene, 
by testing for signs of faulty and painful vision. In such a case, the defective 
gene appears to satisfy Wright’s criterion, since part of the causal explanation 
of the presence of the gene in the population is the deleterious effect of the gene 
on the bearer’s vision. Yet, it would seem odd, at the very least, to say that the 
gene had the function (and not just the effect) of impairing vision. 

There are a number of other examples that also suggest that the Wright
definition is too broad. Any stable feature of the inanimate world characterized 
by feedback loops, that is, any genuine case of dynamic equilibrium, will be 
describable as instantiating teleofunctionality, according to Wright’s definition. 
Suppose, for example, that the presence of ice in a rock crevice causes the crevice 
to remain open (this example was suggested by Anil Gupta in conversation). 
In this case, the existence of ice in the crevice is caused by the power of the 
ice to keep the crevice open. The ice has the Wrightian function of keeping the 
crevice open. Similarly, if the rapid flow of water in a channel keeps the channel 
from silting up, we would have to say that the water flow had the function of 
preventing the deposition of silt, since in the absence of that causal connection, 
the silt would prevent the water from flowing so rapidly. In these cases, Wood- 
field would argue, there is no genuine teleofunction, since ice deposits and water 
flows have no welfare. 

If we merely add the condition of welfare-enhancement to Wright’s defini- 
tion, however, we would seem to have only a verbal difference, one definition for 
Wright-functions, and another for Woodfield-functions, with the dispute con- 
cerning only the appropriate meaning for the English word ‘function’. It is 
possible, however, to reconstrue Woodfield’s position as an alternative meta- 
physical account. We could take Woodfield as claiming that there is a meta- 
physically distinguished class of Wright-functions: those that exist because they 
contribute to the welfare of their bearers. Such an account gives a real causal 
role to the property of goodness (goodness for some kind of organism), resulting 
in something very close to Plato’s theory of the good. 

Mark Bedau (1992) has also argued that an evaluative element is essential to
teleology. Bedau distinguishes “three grades of evaluative involvement.” In the
first grade of involvement, we define the proper function of φ to be ψ by requiring
that φ brings about ψ, and ψ is good. This adds goodness to a pre-Wrightian,
dispositional account of function. In the second grade, we incorporate Wright’s
definition and add that ψ is good as an additional and separate condition. That
is, we require that the thing has φ because φ brings about ψ, and, in addition,
that ψ is good. Finally, in the third grade, we include the goodness of ψ within
the causal explanation of φ: the thing has φ because both φ brings about ψ and
ψ is good.¹

¹ John Searle is another defender of the thesis that teleological judgments presuppose prior
normative judgments (Searle (1995)). He argues, for example, that our judgment that the
function of the heart is to pump blood presupposes a prior commitment to the goodness of
life itself.

Let y(v) represent the situation-type in which the welfare of the type of
organism whose time-slices are of type v is optimized. We could then define
a third-grade or Platonic function (relative to kind v) as one in which the end
promoted also promotes y(v), and the fact that it does so is also causally relevant
to the existence of the functional state. This additional condition, which we can
call the ‘Platonic condition’, requires that there be a causal connection between
ψ (the Wright-functional end of φ) and the welfare of the organism (qua member
of the background kind v).

This third-grade, Platonic account of teleofunctions could be combined with 
a eudaemonistic conception of the good: a theory that the welfare of any organ- 
ism simply consists in the fulfillment of all of its potential Platonic functions. 
This is not a trivial condition, despite the fact that the definition of Platonic 
function makes reference to welfare. The definition of Platonic functions leaves 
open the question of what the good of an organism consists in. Eudaemonism 
would add to this definition the thesis that the good consists in the fulfillment 
of some particular subset of the organism’s Platonic functions. A Platonic eu- 
daemonism would put Wright functions and the good on a par ontologically: 
neither could be reduced to the other. Although the Platonist could not give 
an ontological reduction of the good to the functional, it is still a substantive 
claim about the good to require that it be identified with the fulfillment of some 
subset of the Wright-functions of the organism. 

In addition, the Platonic account is compatible with the claim that, episte- 
mologically speaking, it is possible to learn about the good of an organism by dis- 
covering its Wright-functions. It might well be that nearly all Wright-functions 
are also Platonic functions: that identifying a state as a Wright-function gives 
us good prima facie grounds for identifying it as a Platonic function as well. 
Conversely, it may be that in many cases, identifying a state as conducive to
the good of the organism gives us good but defeasible grounds for supposing the 
state to be one of the organism’s Wright-functions. 

If the Platonic condition is satisfied, then transcendent goodness (that is, 
goodness by a transcendent standard, one that is not reducible to other facts) 
would be connected to the causal network of the world, not in the sense that 
something’s being good (by this transcendent standard) gives the thing some 
new causal power, but in the sense that the existence of certain properties of 
things is to be causally explained in terms of their contribution to the well- 
being of the things possessing them. Thus, goodness or well-being would have 
an indirect, second-order causal relevance to concrete events. 

There is an alternative, somewhat more deflationary account of the role of 
goodness in a Bedavian third-grade definition of teleology. A thing is capable of 
well-being just in case the sum of its Wright-functions forms a highly coherent, 
mutually supportive totality. A Wright-function counts as a genuine teleofunc- 
tion just in case it coheres in this sense with the well-being of its possessor. 
This sort of an account also has echoes of Platonic themes, in this case the 
close connection for Plato between well-being and harmony. A thing, like an 
organism, with a largely harmonious set of Wright-functions is capable of well- 
being; inanimate objects, with largely unrelated, discordant Wright-functions, 
are not.² Plantinga’s example of the dysgenic gene can be excluded, since,
although the gene does have a Wright-function, this function does not cohere well
with the rest of the Wright-functions of its human hosts.

² The harmony, or homeostatic clustering, of human goods plays a central role in Richard
Boyd’s version of moral realism (Boyd (1997)).

There is one more refinement that needs to be made, bringing this deflationary
account closer to the Platonic one. We need to distinguish between
those cases in which the Wright-functions of a thing are harmonious, but the 
harmony of the functions is merely coincidental, and those cases in which the 
harmony of the Wright-functions is itself functional, contributing, perhaps, to 
the adaptive fitness of the organism. According to the deflationary account, 
harmony is constitutive of the good. Hence, both cases are cases of organisms 
with a standard of well-being. Alternatively, we might insist that the harmony 
of Wright-functions must itself be explained by reference to the good. This mod- 
erate position we might call an “Aristotelian” theory of the good. According to 
this account, we can define the good of a thing in the following way: 


Aristotelian Definition of the Good 
• A thing has a good if and only if it has proper functions.


• The good of a thing consists in the successful exercise of its primary proper
functions.


Aristotelian Definition of Proper Function: A state φ has the proper function
ψ in kind v if and only if:


1. The fact that things in kind v have state φ is causally explained (at least
in part) by the existence of a causal law linking (φ & v) to ψ as cause to
effect (Wright’s condition).


2. The system of functions (φᵢ, ψᵢ) meeting condition (1) for v forms a
mostly harmonious, mutually supportive whole, and the (φ, ψ) function
contributes to this harmony.


3. The existence of things of kind v is causally explained (at least in part) 
by the harmony mentioned in condition (2). 


This Aristotelian definition is stronger than the deflationary account, since it 
requires more than the bare fact of the existence of a harmony among Wright- 
functions. At the same time, it takes on much less ontological burden than 
the full-blown Platonic account, since it does not have to postulate goodness 
as a primitive causal factor that explains the existence of Wright-functions. 
Its combination of sober realism with ontological moderation seems to justify 
calling it “Aristotelian,” at least in inspiration. 

Bedau argues that biology makes use only of first- and second-grade func- 
tions. He denies that third-grade functions have a legitimate place in the mod- 
ern, scientific picture of the world. However, he reaches this conclusion because 
he overlooks the possibility of an Aristotelian version of third-grade evaluative 
involvement. In fact, it is the third grade, understood in this deflationary way, 
that is needed to distinguish the functionality of organisms and artifacts from 
self-perpetuating equilibria in the inanimate world. 

For an organism to have a harmonious set of functions, it is not necessary 
that it have no dysfunctional features, nor do we need to exclude the existence 
of a moderate degree of competition and interference between the organism’s 
various functions. Let us say that function x harmonizes with system S just
in case, for many, but not necessarily all, members y of S, the fulfillment of x
increases the probability of the fulfillment of y, and, for most but not necessarily
all members y of S, the fulfillment of x does not significantly decrease the
probability of the fulfillment of y. A system of functions S is harmonious if
nearly every member x of S harmonizes with S − {x}.
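
The definition of harmony just given can be made concrete in a small sketch. The following Python fragment is only an illustration under stated assumptions: the probability tables `base` and `cond`, the cutoffs `many`, `most`, and `nearly_every`, and the tolerance `tol` are hypothetical stand-ins for the deliberately vague quantifiers ("many," "most," "nearly every," "significantly") in the definition, and nothing in the text fixes their values.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical inputs (assumptions, not taken from the text):
#   base[y]      = probability that function y is fulfilled
#   cond[(y, x)] = probability that y is fulfilled, given that x is fulfilled
Base = Dict[str, float]
Cond = Dict[Tuple[str, str], float]


def harmonizes(x: str, system: Iterable[str], base: Base, cond: Cond,
               many: float = 0.5, most: float = 0.75, tol: float = 0.05) -> bool:
    """x raises the probability of 'many' other members of the system and
    does not significantly lower the probability of 'most' of them."""
    others = [y for y in system if y != x]
    if not others:
        return True
    raised = sum(cond[(y, x)] > base[y] for y in others)
    not_lowered = sum(cond[(y, x)] >= base[y] - tol for y in others)
    return raised / len(others) >= many and not_lowered / len(others) >= most


def is_harmonious(system: Iterable[str], base: Base, cond: Cond,
                  nearly_every: float = 0.9) -> bool:
    """'Nearly every' member harmonizes with the remainder of the system."""
    members = list(system)
    if not members:
        return False
    ok = sum(harmonizes(x, members, base, cond) for x in members)
    return ok / len(members) >= nearly_every


# Illustrative numbers only: three mutually supportive functions.
organs = ["heart", "lungs", "gut"]
base = {f: 0.8 for f in organs}
cond = {(y, x): 0.9 for x in organs for y in organs if x != y}
print(is_harmonious(organs, base, cond))  # True
```

On these made-up figures each function raises the chances of the other two, so the system counts as harmonious. Adding a function that lowers the chances of the rest would make that function fail to harmonize, and, with a strict enough `nearly_every`, the whole system with it.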

This definition of harmony is not entirely successful, however, because it 
does not take into account the existence of secondary and tertiary functions. 
For example, the body may respond functionally to a condition in which it has 
suffered massive injuries by radically lowering the metabolic rate. This func- 
tional response is fulfilled only when many other functions have failed; hence, 
the fulfillment of this secondary function significantly lowers the probability of 
the fulfillment of most of the body’s functions, since it entails that these func- 
tions have in fact failed. It is possible that an organism could exist most of 
whose functions were secondary ones. In response, let us say that a function x 
compensates for a set of functions T just in case the successful fulfillment of x
entails that none of the members of T are fulfilled and is causally posterior to
the failures of the members of T. A function x meta-assists y relative to T just 
in case x compensates for T and the fulfillment of x increases the probability 
of the fulfillment of y, conditional on the failure of the members of T. We can 
then weaken the definition of harmonizing with a system by requiring only that
the function meta-assist some of the members of the system, relative to some 
proper subset of the system. A system is harmonious if most of its members 
harmonize (in the new, weaker sense) with the remainder of the system, and 
many of its members harmonize (in the first, stronger sense) with it. 
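
The notions of compensation and meta-assistance can likewise be sketched. The fragment below assumes a finite probability distribution over "worlds," each world being the set of functions fulfilled in it. This modeling choice, the names `prob`, `cond_prob`, `compensates`, and `meta_assists`, and the numbers in the example are all hypothetical. Causal posteriority cannot be read off probabilities, so it is simply stipulated here as a set of pairs.

```python
from typing import Callable, Dict, FrozenSet, Optional, Set, Tuple

World = FrozenSet[str]        # the functions fulfilled in one possible outcome
Dist = Dict[World, float]     # assumed probability of each outcome


def prob(dist: Dist, event: Callable[[World], bool]) -> float:
    return sum(p for w, p in dist.items() if event(w))


def cond_prob(dist: Dist, event: Callable[[World], bool],
              given: Callable[[World], bool]) -> Optional[float]:
    denom = prob(dist, given)
    if denom == 0:
        return None  # conditional probability undefined
    return prob(dist, lambda w: event(w) and given(w)) / denom


def compensates(x: str, T: FrozenSet[str], dist: Dist,
                posterior: Set[Tuple[str, str]]) -> bool:
    """x's fulfillment entails that no member of T is fulfilled, and is
    (by stipulation in `posterior`) causally posterior to each failure."""
    entails_failure = all(not (w & T) for w, p in dist.items() if p > 0 and x in w)
    return entails_failure and all((x, t) in posterior for t in T)


def meta_assists(x: str, y: str, T: FrozenSet[str], dist: Dist,
                 posterior: Set[Tuple[str, str]]) -> bool:
    """x compensates for T and raises the probability of y, conditional
    on the failure of every member of T."""
    if not compensates(x, T, dist, posterior):
        return False
    t_failed = lambda w: not (w & T)
    with_x = cond_prob(dist, lambda w: y in w, lambda w: t_failed(w) and x in w)
    baseline = cond_prob(dist, lambda w: y in w, t_failed)
    return with_x is not None and baseline is not None and with_x > baseline


# Illustrative numbers only: lowered metabolism after circulatory failure.
dist = {
    frozenset({"circulation", "tissue_repair"}): 0.90,
    frozenset({"slow_metabolism", "tissue_repair"}): 0.06,
    frozenset({"slow_metabolism"}): 0.01,
    frozenset({"tissue_repair"}): 0.01,
    frozenset(): 0.02,
}
T = frozenset({"circulation"})
posterior = {("slow_metabolism", "circulation")}
print(meta_assists("slow_metabolism", "tissue_repair", T, dist, posterior))  # True
```

In this toy distribution the secondary function never occurs alongside the primary one, yet it raises the chance of a further function once the primary one has failed, which is the pattern the weakened definition is meant to capture.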

An organism fighting off an infection, or infested with a parasite, is the 
locus of two disjoint systems (its own and the parasite’s), each internally har- 
monious, and each in conflict with the other. In cases of symbiosis, we can 
identify two disjoint systems, even though they are mutually supportive, since 
the ancillary connections between the two systems are much fewer and weaker 
than those within each one. Cases such as that of the mitochondria lie on the 
vague boundary between organic unity and close, long-established symbiosis. 

Any organism will suffer from a certain degree of dysfunctionality. The 
standard is one of substantial harmony among functions, not ideal or optimal 
harmony. The function of x is determined not by working out what x is
optimally designed for, but by working out whether the most likely explanation
for the origin of x involves a causal connection between x and some effect. 
For example, there are cases of selfish DNA, genes that take control of the gene 
replication process, producing multiple copies of themselves on the chromosome, 
despite the fact that they interfere with the organism’s fitness. These selfishly 
antisocial genes constitute a kind of self-perpetuating genetic illness, a chromo- 
somal parasite. The existence of such imperfections in the chromosomal system 
does not pose any challenge to the obvious fact that the function of the system 
includes cell reproduction and protein synthesis. 

For the purposes undertaken in this book, I will use the Aristotelian
definitions of good and of proper function as my working hypotheses. I believe
that the Aristotelian definition is weak enough to include as proper functions
everything we would want to attribute as such to organisms and to artifacts,
while strong enough to exclude as functional any property of the inanimate, natural world.


12.1.3 Natural Selection Accounts


Very roughly, Millikan (1984) defines the relation of functionality in terms of 
actual contribution to the survival and reproduction of the organism’s ancestors. 
The eye has the function of registering information of a certain kind because 
of the fact that similar organs in the ancestors of the organism in question 
contributed to the successful reproduction of those ancestors by registering such 
information. Millikan’s account is explicitly retrospective, which invites certain 
kinds of objections. The first appearance of a new adaptation is always non- 
functional, since it cannot acquire a function until it has actually contributed 
causally to successful reproduction. This applies even to artifacts: if I design a 
widget to perform a task, and it does so and in the very way that I envisaged, 
it still does not have that function until its success at meeting the need for 
such functionality results (say, through the marketplace) in the reproduction 
of duplicate widgets. In addition, on Millikan’s account, once a function has 
been acquired, it can never be lost. The sightless eyes of cave fish still have the 
function of seeing, and words of contemporary English still carry the meanings 
of their Indo-European roots. These results seem counterintuitive. 

One solution would be to make Millikan’s account prospective instead, as 
Bigelow and Pargetter (1987) have done. On their account, a state has a par- 
ticular function if the fact that it tends to produce this result enhances the 
reproductive fitness, here and now, of the organism in question. It is not clear 
that this strategy will work, however, since it is unclear what “reproduction” 
can mean in a purely prospective sense. Millikan has the advantage of being 
able to make reference to an already existing family of similar, self-perpetuating 
structures. Since everything is similar to everything else in some way, it is un- 
clear what “the reproduction of x” can mean, in the absence of some already 
existing class of organisms to which x belongs. 

In addition, it is unclear how to define the ‘present environment’ of the 
organism, and unclear what should provide the baseline for comparison. Bigelow 
and Pargetter tell us that the adaptation should improve the reproductive fitness 
of the organism, but what sort of situation provides the benchmark against 
which we are to measure improvement or deterioration? 

There is, however, a more fundamental problem with all of these accounts: 
the fact that they make the truth of Darwinism a matter of ontological necessity. 
Surely it is possible, in some suitably broad sense, that functional organisms 
come into existence in the way described in the book of Genesis, even if this is 
not the way things happened in the actual world. Moreover, it would seem to be 
possible for there to exist what Richard Sorabji (1964) calls “luxury functions”: 
functions that do not in fact enhance the reproductive fitness of their bearer, 
and that did not enhance the reproduction of its ancestors. For example, the 
capacity to appreciate beauty for its own sake, or the ability to track the truth 
in metaphysical domains, may be genuine functions of the human mind that
have nothing to do with reproductive fitness. It is at least possible that such
functions exist; our fundamental account of the nature of function should not 
exclude these possibilities. 

Moreover, all accounts involving natural selection, whether retrospective or 
prospective, have the drawback that they cannot count artifacts that are the 
product of an original act of intelligent design as having a function. If an 
inventor designs a new kind of mousetrap, the mousetrap does not have the 
function of catching mice (according to these natural-selection accounts) until 
it has been reproduced in response to demand driven by success in actually 
catching mice (on the retrospective account), or unless it has the propensity 
of being reproduced for this reason (on the prospective account). The Wright- 
based definitions of function have the advantage of covering both the products of 
natural selection and those of one-off intelligent design (whether or not they have 
been or are likely to be reproduced), without the need for any gerrymandered 
disjunctivity. On Wright’s account, the mousetrap has the function of catching 
mice so long as its propensity to do so is a cause of the inventor’s constructing it 
as he did. An actual history of catching mice, or a likelihood of being reproduced 
in the future in response to future success in doing so, is not required. 

It is far more plausible to take natural selection as a mode of explaining 
how it is that functions exist in the world, not as an account of what it is for 
something to be a function. 

Neander (1991) has defended a natural-selection account of teleology as an 
analysis of the concept of function, as it figures in the thinking of contemporary 
biologists. According to Neander, in the specialist language of contemporary 
biologists, the word ‘function’ just means ‘selected for by nature’. If contem- 
porary biologists have made the truth of Darwinism a matter of stipulative 
definition, so that to deny the neo-Darwinian synthesis, one would have to deny 
that biological functions exist, then this would constitute an unjustifiable form 
of dogmatism, setting up a conceptual barrier to any future theory that might 
prove superior to the contemporary synthesis. This stipulation would make 
rational dialogue between Darwinists and contemporary or future critics im- 
possible, since supplanting the present theory would require a conceptual and 
linguistic revolution. 

Moreover, the notions of ‘function’ and ‘natural purpose’ have roles to play 
far beyond the narrow world of biological specialists. Functionality is an im- 
portant concept in our commonsense view of the world, and it is needed (I 
will argue) in an adequate theory of epistemology and ethics. The content of 
such a widely used concept cannot be settled by the linguistic conventions of a 
specialized community. 


12.2 Darwin: Real or Only Apparent
Functionality?


Darwin’s theory of natural selection has been taken in two quite opposing ways 
on the question of its bearing on teleology. The American biologist Asa Gray 
took Darwin’s theory as vindicating the reality of biological teleology, and, in 
a letter to Gray (Gilson, 1984, pp. 80-87), Darwin himself seems to endorse 
this inference. In contrast, many philosophers and scientists, including, most 
recently, Richard Dawkins and Daniel Dennett, have taken the upshot of Dar- 
win’s theory to be that all biological functionality is merely apparent, with 
natural selection explaining the existence, not of real teleology, but only of its 
appearance in nature. 

These two conclusions are most probably based on two different under- 
standings of the nature of teleology. It would seem that those taking the 
Dawkins/Dennett line assume that the existence of a function entails the exis- 
tence of a designer or creator, whose prior intentions, or whose intentions plus 
their effective realization, constitute the functional character of the product. 
Alvin Plantinga, in his recent book Warrant and Proper Function (Plantinga 
(1993)), explicitly affirms the existence of this implication. I have two reasons 
for demurring. First, it seems that something like the accounts of Wright or 
Woodfield are adequate characterizations of functionality, with the products of 
intentional design clearly falling under the definiens, without necessarily ex- 
hausting its extension. Second, I hope to give an account of intentionality in 
terms of teleofunctionality (roughly, a state represents a fact just in case it has 
the function of carrying the corresponding potential information), so accepting 
Plantinga’s analysis would doom such an analysis to vicious circularity. 

Consider again the case of the bird’s wing’s having the function of enabling 
flight. The causal connection between the presence of wings and flight was itself 
a higher-order cause of the successful survival of winged ancestors of existing 
birds. A given stage of a winged-bird organism is caused to be a bird-stage, and
hence is caused to be winged, by these earlier successes in survival. Thus, there 
is an indirect causal connection between the causal connection between wings 
and flight and the presence of wings in the given specimen. Wright’s definition 
is satisfied. Moreover, the bird’s Wright-functions form a harmonious system, 
and this harmony itself contributes to the bird’s fitness. 

The connection via natural selection is indirect and retrospective. If all 
actual teleology is explainable by natural selection alone, we should deny the 
existence of real (as opposed to merely apparent) “luxury functions.” However, 
this would be a consequence, not of an ontological theory (as in Millikan’s case), 
but of biological theory. 

Functions that are explained by natural selection are indirect and retrospec- 
tive, but so are functions that are explained by the intentions of a designer. The 
intentions of the designer mediate between, on the one hand, the causal connec- 
tion between the trait and its effect, and, on the other hand, the existence of the 
functional trait in the product, just as the evolutionary history of an organism
mediates between these two in the case of natural selection.


12.3 Retrospective and Non-Retrospective
Accounts 


So far, all of the definitions of teleology we have considered have been ret- 
rospective in nature, in the sense that the function of a thing depends upon 
what was involved in causing certain features of that very thing. This would 
mean that teleofunctions do not supervene on the internal organization of a 
thing: two internally indistinguishable systems could have different functions 
due to differences in the causal histories involved. For example, a swamp-bird 
that forms spontaneously, without evolutionary history, has swamp-wings that, 
unlike birds’ wings, do not have the function of enabling flight, even if the 
swamp-bird does soar about with apparent facility. 

Many philosophers, including Dretske and Millikan, are content to bite the 
bullet of this consequence. My inclinations are to try to dodge it. Moreover, 
there is another problem with retrospective natural-selection accounts of teleol- 
ogy. Ironically, they propose an essentially neo-Lamarckian conception of func- 
tion. According to Lamarckian theory, use must always precede function. It is 
only after a particular structure or behavior has proved its usefulness in practice 
that it can be incorporated into the set of adaptations of the individual or popu- 
lation. In contrast, neo-Darwinian theory opens the door to the possibility that 
a function can emerge spontaneously, by fortuitous mutation. Natural selection 
explains not the origin or nature of the function, but its successful perpetua- 
tion. This issue is particularly acute when one attempts to understand systems 
of interdependent functions. Consider, for example, the mutually presupposing 
functions of sexual reproduction. The function of the sperm is to fertilize the 
ovum; and the function of the ovum is to receive the sperm. Neither can operate 
before the other is functional. Hence, it is incoherent to insist that the gametes 
cannot be functional until past instances of each have successfully been used in 
reproduction. 

Indeed, in this case, the difficulty for the natural-selection account of func- 
tionality is especially acute, since there can be no such thing as gametes before 
the functional system of sexual reproduction has been established. Hence, the 
functionality of gametes cannot be explained in terms of the previous history of 
gametes, since there could not, by the very nature of the case, be such a thing. 

It is possible to define the function of a thing without building in any con- 
ditions about the actual causal history of that very thing. Let us say that the 
Aristotelian definition of function given above is the definition of ‘etiological 
function’. Then, we can say that some feature A of some thing x of kind K 
has function F just in case the objective probability is greater than one-half
that something with the internal organization specified by K would have been
caused in such a way as to make F the etiological function of A. For example,
the swamp-bird belongs, by virtue of its internal organization, to a class of
things B of such a kind that the objective probability is greater than one-half
that an arbitrary member of B came into existence through the kind of natural 
selection responsible for the existence of ordinary birds. Therefore, even though 
the swamp-bird came about in a very unusual way, a way in which the causal 
powers of its wing-like appendages had no role, we can still say that the function 
of these appendages is to enable the swamp-bird to fly. 

In contrast, if natural processes accidentally produce something internally 
indistinguishable from a very crude arrowhead, we do not have to say that its 
function is to act as the point of an arrow, since the objective probability of the 
accidental production of such a system is non-negligible. The difference between
the swamp-bird and the arrowheadlike stone lies in the astronomical difference 
in the objective probabilities of the spontaneous generation of each. 
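
The contrast between the swamp-bird and the arrowhead-like stone can be put in a short sketch. It assumes that, for a given kind of internal organization, we can list the possible causal origins, the objective probability of each, and whether an origin of that sort would make F the etiological function of the feature. The function name `has_function` and every number below are hypothetical and chosen only to illustrate the one-half threshold.

```python
from typing import List, Tuple

# (description of origin, objective probability, would this origin make F
#  the etiological function of the feature?) -- all values are assumptions.
Origin = Tuple[str, float, bool]


def has_function(origins: List[Origin], threshold: float = 0.5) -> bool:
    """Non-retrospective test: the feature has function F just in case the
    probability of an F-conferring origin exceeds the threshold."""
    total = sum(p for _, p, _ in origins)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("origin probabilities must sum to 1")
    p_etiological = sum(p for _, p, confers in origins if confers)
    return p_etiological > threshold


# Illustrative figures only.
swamp_bird_kind = [
    ("natural selection", 0.999999, True),
    ("spontaneous assembly", 0.000001, False),
]
crude_arrowhead_kind = [
    ("deliberate knapping", 0.4, True),
    ("accidental fracture", 0.6, False),
]

print(has_function(swamp_bird_kind))       # True
print(has_function(crude_arrowhead_kind))  # False
```

The verdict depends only on the kind of organization and the distribution over origins, not on the actual history of the individual, which is why the swamp-bird's appendages come out as functional even though it formed spontaneously.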

In the case of systems of interdependent functions, such as those of the 
sperm and the ovum, each individual gamete, even the very first, is such
that it is very likely that something so organized resulted from a process that
included successful reproduction (i.e., favorable natural selection). Even though
the original gametes had no such selective history, it is far more likely (in terms
of objective chance) that something so organized is one of the many successful
descendants of the original mutants than that it is a product of favorable mutation.
Thus, the original gametes were fully functional, despite the fact that their 
actual history included nothing that satisfies Wright’s higher-order condition. 


12.4 Extrinsic Functions and the Extended 
Phenotype 


It is harmonious systems of Wright-functions that give rise to teleology. These 
systems are not entirely internal to the body of a given organism: instead, they 
embrace the pattern of interaction between the organism and its environment. 
Features that explain their own existence via their contribution to survival and 
reproduction I call “intrinsic functions.” Features that explain the perpetuation 
of the organism/environment system, but that would still exist in the absence 
of that system, I call “extrinsic functions.” For example, the structure of the 
human heart has the intrinsic function (relative to the human system) of pump- 
ing the blood. The presence of oxygen in the atmosphere (relative to that same 
system) has the function of providing oxygen to the bloodstream through the 
lungs. The human phenotype extends into the entire ecological niche belonging 
to the human system. 

Teleological functions supervene on the present state of the extended phe- 
notype, and not on the internal state of the organism itself. For example, the 
intrinsic function of the coloration of the viceroy butterfly is to mimic the ap- 
pearance of the poisonous monarch, while the coloration of the monarch has no 
such function, and would not have it even if the chemical basis for color were 
identical in the two species. 

The same structure can have two different functions, by belonging simultaneously
or successively to two different extended phenotypes. The fat of the mouse
serves the intrinsic function of storing energy for the mouse and the extrinsic 
function of providing nutrition to the cat. The shell of the mollusk first serves 
the intrinsic function of protecting the mollusk and then, after being abandoned 
and adopted by the hermit crab, it serves the extrinsic function of housing the 
crab. 


12.5 Our Knowledge of Teleology 


Our knowledge of teleological connections can be either direct or indirect. If 
a particular instance of functionality is mediated by natural selection or by a 
designer’s intentions, then we can discover the functionality by discovering the 
mediation. If I can demonstrate that a particular organism would not long 
survive if a molecule did not have a particular effect, then I can reasonably infer 
that the molecule has that effect as one of its functions. Similarly, if I learn that 
a competent designer intended the artifact to crush olives, and it does in fact 
crush them, and in the way envisaged, then I have learned that crushing olives 
is one of its functions. 

It is possible to have direct knowledge of a teleological connection, regard- 
less of whether the connection is itself direct or indirect. This direct knowledge 
is simply a matter of inferring a simple causal generalization from a large and 
variegated sample of instances. Suppose that we find, in the members of a
population v, a large number of factors, φ₁, φ₂, ..., φₙ, each of which, independently
of the others, promotes some effect ψ. In such a case, it is reasonable to infer
that the presence of each factor φᵢ is caused by the causal connection between
φᵢ and ψ, that is, that each of the φᵢ’s has the function of ψ-ing. This conclusion
can be further confirmed by finding analogous cases: similar populations
v₁, ..., vₘ, in each of which there is a family of properties having common effect
ψᵢ (similar to ψ). It can also be confirmed by fitting the functional connection
between φ and ψ into a network of coherent, mutually supportive functions.
William A. Dembski’s recent monograph (Dembski (1998)) delineates precise 
criteria for excluding chance as the source of such teleological patterns. 

It is important to recognize that, despite what Daniel Dennett has said about 
the “intentional stance,” it is not necessary to discover states that optimally 
realize some end. Imperfect functions can be discovered with as much objectivity 
and certainty as can optimally designed functions. It is not required that the 
various factors that promote some end ψ do so optimally: all that is required is that
there exist a number of separate factors whose existence can be economically 
explained by reference to their common effect. 

Once it has been established that such a teleological connection exists, the 
focus of inquiry can then be shifted to discovering whether the connection is 
unmediated or mediated by natural selection or explicit intentionality. The 
conclusion that an unmediated connection exists can be supported not only by 
the failure to find a mediating mechanism, but also by the discovery of simple 
laws of teleology from which a wide range of actual connections can be deduced. 




12.6 Teleological Natural Kinds 


A natural kind, especially in biology, is best characterized in terms of the func- 
tions of its members. The core of a natural kind is a kind of fixed point. A 
class A is such a core just in case the following relationship holds. Let f(A) be 
the set of functional connections instantiated by every member of A. Then, A 
is a core of a natural kind if and only if A is identical to the set of individuals 
instantiating every member of f(A). The natural kind of which A is the core 
consists of all those things instantiating nearly all of the functional connections 
in f(A) (an admittedly fuzzy set). 
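
The fixed-point condition above can be illustrated with a small sketch. The toy population, the function names, and the 0.8 threshold standing in for "nearly all" are assumptions introduced only for illustration; nothing in the account fixes a particular threshold, and the empty class is simply excluded here by stipulation.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy data: the functional connections each individual instantiates.
Instantiates = Dict[str, FrozenSet[str]]

CAT_FUNCTIONS = frozenset(
    {"pump_blood", "digest_meat", "retract_claws", "purr", "see_in_dim_light"}
)
TOY_POPULATION: Instantiates = {
    "cat_1": CAT_FUNCTIONS,
    "cat_2": CAT_FUNCTIONS,
    "cat_3": CAT_FUNCTIONS - {"see_in_dim_light"},  # lacks one connection
    "oak_1": frozenset({"photosynthesize", "transport_sap"}),
}


def shared_functions(members: Set[str], inst: Instantiates) -> FrozenSet[str]:
    """f(A): the functional connections instantiated by every member of A."""
    if not members:
        return frozenset()  # assumption: treat the empty class as having no f(A)
    return frozenset.intersection(*(inst[m] for m in members))


def instantiators(functions: FrozenSet[str], inst: Instantiates) -> Set[str]:
    """All individuals instantiating every connection in `functions`."""
    return {x for x, fs in inst.items() if functions <= fs}


def is_core(members: Set[str], inst: Instantiates) -> bool:
    """A is a core iff A is exactly the set of individuals instantiating all of f(A)."""
    if not members:
        return False
    return instantiators(shared_functions(members, inst), inst) == members


def natural_kind(core: Set[str], inst: Instantiates, threshold: float = 0.8) -> Set[str]:
    """Individuals instantiating 'nearly all' of f(core); the threshold is an assumption."""
    fs = shared_functions(core, inst)
    if not fs:
        return set()
    return {x for x, xs in inst.items() if len(fs & xs) / len(fs) >= threshold}
```

On this toy data, the class containing cat_1 and cat_2 is a core, the class containing cat_1 alone is not (cat_2 instantiates everything cat_1 does), and cat_3 falls within the fuzzy kind built on that core without belonging to the core itself.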

In the case of sexual and other social animals, the situation is somewhat 
more complicated. Certain teleofunctions, such as the male and the female sex- 
ual function, are exercised not at the level of individual organisms, but rather 
at the level of groups of individuals. To get a comprehensive picture of the tele- 
ofunctions involved, we must move from the organismic level to the population 
level, as Elliott Sober (1984) has argued. 

However, Sober erred in setting population thinking against essentialism, as 
though the two were inherently incompatible. In fact, Aristotle himself recog- 
nized, although perhaps with some degree of confusion and indistinctness, that 
for political animals such as human beings, a full account of their natures could 
be given only by studying the structures of their societies. Hence, the Politics is 
the indispensable companion to Aristotle’s Ethics (and even in the Ethics, the 
group phenomenon of friendship plays a crucial role). 

Even granting the importance of sexual, social, and other population-level 
functions, it remains the case that there must be a fairly high degree of overlap 
between the proper functions of the various individual organisms belonging to 
a single natural kind. If we do not require such overlap or similarity, we will 
erroneously count members of two symbiotic species as belonging to the same 
natural kind. 


13 


Causal Theories of Mental Content


In the early 1980s a number of accounts of mental content involving teleology 
and proper function were developed. I will concentrate here on two represen- 
tative theories, those of Millikan and Dretske, and on an influential critique of 
the teleological turn by Fodor. 


13.1 Millikan 


In Language, Thought, and Other Biological Categories (Millikan (1984)), Mil- 
likan proposes both a causal theory of teleological or proper function and a 
teleological account of intrinsic meaning or intentionality. Very roughly, Mil- 
likan proposes that a feature F has proper function P for an organism x (or a 
member of some other sort of reproductive kind) just in case F’s causing P in 
the ancestral history of x contributed to the survival or reproductive success of 
x’s ancestors. F’s causing P is something for which F was selected by nature
(in Elliott Sober’s terminology). I argued in chapter 14 that Millikan’s theory 
confuses the definition of teleology with its explanation, and that a much sim- 
pler and more comprehensive definition is possible along the lines of work by 
Charles Taylor and Larry Wright. 

Millikan proposes that the content of a sign consists of those conditions in 
the world to which the sign is supposed to “map.” For instance, 


... the sense of an indicative sentence is the mapping functions (informally,
the “rules”) in accordance with which it would have to map
onto the world in order to perform its proper function or functions 
in accordance with a Normal explanation (italics hers). (Millikan, 
1984, p. 11) 


It is never very clear (at least, not to me) exactly what a “mapping function”
is, or what it means for a sign to “map” onto the world. We know that
this is what a sign is supposed to do, and that when it does it, the sign is true
(if a complete sentence) or at least significant (bearing a “real value”), but the 
notion of mapping is seriously underspecified in Millikan’s work. In chapter 16, 
I identify the content of a natural representation with the information that the 
representation is supposed to carry robustly. The notion of robust carriage of 
information, in turn, was made precise in chapter 11. My account of informa- 
tion could be taken as one way of filling in the details in Millikan’s notion of 
“mapping.” 


13.2 Dretske 


In Explaining Behavior (Dretske (1988)), Fred Dretske argues that represen- 
tation is possible only as a result of a learning process by which an organism 
becomes attuned or calibrated to its environment. Dretske rejects the idea that
a teleological explanation based on natural selection and evolutionary history
can be explanatory of behavior, and, consequently, he rejects such functionality
as the basis for an account of mental content. Dretske bases this rejection of
the possibility of inherited intentionality on Sober’s distinction in The Nature of
Selection (Sober (1984)) between developmental and selectional explanations. 

Sober argues that natural selection cannot explain why a given organism be- 
haves as it does. Instead, it explains why there exist only organisms who behave 
in certain ways. Natural selection does not make existing organisms behave the 
way they do: it eliminates organisms who act differently. I can explain the 
fact that all my friends drink martinis in two different ways: developmentally, 
by explaining the origin and evolution of each of my friends’ tastes in alcoholic
drinks, or selectionally, by demonstrating that any would-be friend who does 
not drink martinis is eliminated from contention on that basis. Only the former 
sort of explanation really explains the friends’ behavior. 

However, I think this argument moves a little too quickly. We should dis- 
tinguish between relational and absolute selection. Relational selection explains 
why nearly every organism of some kind in a certain location, or bearing some 
other relation to certain reference markers (e.g., by being a friend of mine),
possesses a certain property. Absolute selection explains why nearly every organism
of some kind in existence possesses a certain property. Absolute selection does 
explain the existence of the behavior that has been selected for, and so should 
count as a legitimate form of explanation of that behavior. 

It is odd, to say the least, to insist, as Dretske does, that the calibration 
that occurs in the lifetime of an organism confers content on its states, but that 
the calibration that occurs over the span of evolutionary history cannot. 

Dretske’s account of the basis of the mental content of beliefs is very similar 
to my own. Dretske proposes that a belief-type M represents that P is the case 
just in case M has the function of indicating that P is the case, i.e., has the 
function of carrying the information that P is the case. The account that I
develop in chapter 16 differs from Dretske’s account in two ways: (1) it defines
content in terms of having the function of robustly carrying the information
that P, and (2) it introduces a relation-parameter, fixing the relation between
the mental event-token and the event-token that is the object of representation. 
The first difference means that when a belief fulfills its functions, it becomes 
thereby not just a true belief, but a case of knowledge. The second difference is 
critical in order to avoid Liar-like paradoxes, besides the fact that it enables the 
account to cohere well with recent work in philosophical linguistics (including 
situation semantics and discourse representation theory). 

Dretske’s account of desires differs quite fundamentally from mine. I take a 
desire to have representational content: a desire for X is a state that represents 
the fact that it would be good for me to have X in the near future. Beliefs 
and desires interact with the faculty of the will, whose function is to generate 
intentions and volitions that successfully guide one toward the fulfillment of 
one’s functions (eudaemonia). A volition to φ is a state whose proper function
is to cause the performance of an act of φ-ing. Thus, I follow Bratman (1987)
in rejecting a simple belief/desire model of intentional action. 

Dretske (1990) defines a desire for X as a state that reinforces behavior that 
results in X, or, more precisely, a state D is a desire for X just in case past 
tokens of D reinforced certain forms of behavior exactly because that behavior 
resulted in tokens of X. I would contend that Dretske’s account is an account 
of tropism, not desire. Dretske’s belief/desire model of behavior is really a 
perception/tropism model. A desire is something that can be weighed and 
evaluated, acted upon or resisted. A desire is one way in which a future state 
can be represented as good, but it is not the only way. A human can act against 
all of her present desires, when she believes (as a result of inference or intuition) 
that the greatest good demands doing so. Dretske’s model leaves out the role 
of the will entirely, oversimplifying the phenomena. 

Dretske argues that he can explain the causal efficacy of mental content, 
since the fact that mental event-type M indicates P can be used in causal 
explanations of the fact that M results in behavior of type B. The argument 
goes something like this: The fact that M is a reliable indicator of P is part of 
the explanation of how M became connected to behavior B as a result of learning 
(operant conditioning). However, as Lynn Baker and others have pointed out, 
there are at least two difficulties with this argument. First, at best it shows that 
what M indicates (the information M carries) is causally relevant to behavior. 
It does not establish that M’s having the content it does, its having the function 
of indicating P, is causally relevant to behavior. Second, it does not show that 
the fact that a given token of M indicates P is causally relevant to that token’s 
causing a token of B. What Dretske’s argument shows is that the fact that past 
tokens of M indicated P is relevant to a present token’s causing B. 

Dretske’s response to these objections is to argue that the present token of 
M’s having the content it does consists in the facts about the token’s history 
that explain the connection between M and B in terms of M’s indicating P.
Consequently, to say that a certain fact about past tokens (namely, the fact 
that these past tokens of M became connected to B in the organism because 
they indicated P) is causally relevant to the present production of B by a token 
of M is just to say that the fact that the present token of M has the content
P is causally relevant to the production of B: the two facts are one and the
same (Dretske (1991)). However, Dretske is confusing two things: (1) the past 
history of the present token of M that establishes the connection between M, 
P, and B and (2) the present token’s having this history. The first of these is 
causally relevant to the production of B, but the second is not. According to 
Dretske’s definition, having the content of P is a kind of “Cambridge property”: 
it is not an intrinsic property of a present token of M. It is like the property of 
being an x such that Clinton is President: a property that really characterizes 
Clinton, or the world, not x.

In chapter 16, I argue that the causal efficacy of mental content depends 
on the existence of higher-order functions. For example, our faculties of infer- 
ence involve such higher-order functions: these faculties have the function of 
responding to other functional states (beliefs) according to their contents. The 
contents of beliefs thereby become causally efficacious by virtue of triggering the 
operation of these higher-order functions. Lower animals such as worms, with- 
out such higher-order functions, have states with mental content (perceptions, 
reflexes, and so on), but these contents are not causally efficacious. They help 
us to understand the worm’s behavior functionally, but they contribute nothing 
to our understanding of the etiology of the behavior. When dealing with cogni- 
tively sophisticated organisms, by contrast, knowledge of mental contents can 
be used in constructing perfectly accurate stories about the causal links in the 
production of behavior, since some higher-order functions are associated with 
dispositions to respond to mental content as such. 


13.3 Fodor’s Critique of Teleological Semantics


In A Theory of Content and Other Essays (Fodor (1990)), Jerry Fodor lodges 
two principal objections against the teleological account of mental content. 
First, he argues that this theory is not able to account for the possibility of 
error or misinformation, since natural selection cannot make distinctions of suf- 
ficiently fine grain. Second, he argues that these accounts cannot explain the 
mental content of thoughts in modalities other than belief and desire. 

A particular kind of stimulation of the retina of a frog causes a response 
involving the tongue. Teleologically speaking, the purpose of this response is to 
catch and eat a fly (part of the frog’s normal diet), and the purpose of the retinal 
stimulation is to carry the information ‘Fly at point X’. Natural selection has 
selected this stimulus-response pathway for the purpose of catching flies, because 
the typical cause (in the frog’s ancestral environment) of this stimulation has 
been the presence of a suitable fly in the appropriate position. Apparently, we 
can explain the intrinsic content of the retinal stimulation by reference to this 
teleological function. 

Fodor, however, objects that this account attributes too fine an eye for
discrimination to Mother Nature. Suppose that the frog’s vision cannot distinguish
between flies and BBs. Suppose further, as seems plausible, that hovering BBs
have not been common inhabitants of the frog’s ancestral niche. In this case, the 
retinal pattern carries the information, not only that a fly is probably present, 
but also that either a fly or a BB is probably present. Since things that are 
flies-or-BBs in the frog’s ancestral niche are almost all edible (since they are 
nearly all flies), detecting and trying to eat flies-or-BBs is something natural se- 
lection can select for. Consequently, we have no reason to deny that the retinal 
pattern means ‘Fly or BB at location X’. Since any error will be the result of 
something present that is indistinguishable from a fly, any erroneous represen- 
tation will have a perfectly veridical content, formed by the disjunction of ‘fly’ 
with whatever actually caused the stimulation. 

Fodor’s objection assumes that causation and teleology cannot distinguish 
between two properties that are very reliably co-extensive. This is simply wrong. 
As I argued in part I, causal (and therefore also functional) contexts are ex- 
tremely sensitive to subtle differences in properties. I showed how not even 
logically equivalent properties are intersubstitutable salva veritate in causal
contexts. From the fact that the presence of flies caused the survival of the 
stimulus-response pathway, plus the fact that flies-or-BBs are co-extensive with 
flies (even reliably, counterfactually so), it does not follow that the presence of 
flies-or-BBs has been a causal factor in the survival of the pathway. Mother 
Nature, in her guise as guardian of causal relations, has very sharp eyes indeed. 

Fodor also points out that states with mental contents, such as the idea
of a cow, can occur in modalities quite different from belief. For example, I 
could, in a particularly bucolic mood, daydream about a cow. This daydream 
carries no information about the real existence of a certain kind of cow anywhere 
or anytime, nor is it supposed to. Daydreams fulfill some radically different 
purpose. I must concede that Fodor has a good point here, and I do not know 
of any existing teleological account, including mine, that adequately addresses 
the basis of mental content in modalities other than beliefs, desires, intentions, 
and volitions. Perhaps something like Fodor’s language of thought hypothesis 
is correct, and the very same syntactic item occurs in both belief contexts and 
in daydreams. Or perhaps there is some other sort of systematic connection 
between beliefs whose content has a certain form and daydreams with isomorphic 
content. Obviously, a great deal of additional work remains to be done, but 
the apparent success of a teleological account in explaining the nature of the 
content of states closely connected with action gives some reason to hope that 
this success can be repeated with accounts of the content of more remote states. 


14 


Teleosemantics of Mental Representations


14.1 An Overview of Representational States 


There are a variety of mental states in humans and animals that clearly have 
representational content: perceptions, actions and attempted actions, memories, 
thoughts, opinions, doubts, wishes, etc. In this chapter, I will sketch, in a very 
programmatic way, how my accounts of teleofunctionality and information can 
be used to account for mental representation. 

Mental representations fall into a number of significant categories. A dis- 
tinction of fundamental importance is that between cognitive and pre-cognitive 
states. I take the defining characteristic of cognitive states to be the presence of 
sub-sentential or sub-propositional structure: the existence of identifiable com- 
ponents corresponding to subjects and to predicates, to designating terms and 
to predicating expressions. I do not intend to prejudge the question of the pres- 
ence of language of thought in the brain, realized in a particular kind of symbolic 
architecture. Even if the brain’s architecture is thoroughly connectionist, the 
phenomena of the articulation of thought into individual and general concepts 
must be accounted for. 

It is the presence of sub-sentential structure in cognition that enables us 
to formulate and use rules, and to engage in induction and abstraction, as 
well as to reason discursively. A limited amount of reasoning is possible pre- 
cognitively, to the extent that something analogous to negation and disjunction 
is present. However, it is the kind of reasoning modeled in predicate logic that 
gives cognition its characteristic freedom from the particular.

A four-way distinction among mental states is also useful. First, there are 
states that are immediately involved in action and behavior: motor volitions and 
preparations. It is these states that occur when one tries to perform some ba- 
sic action and fails, and, of course, they also occur when one succeeds. Second, 
there are states that represent some sort of pro-attitude: appetites, desires, goals
and intentions. Third, there are states whose function is to register information:
perceptions and perceptual memory, opinions and intellectual knowledge.
Finally, there are the second-order cognitions: conjectures, doubts, wishes, fancies,
fictions, hypotheses, and the like. 

Combining these two sets of distinctions, we can generate the following map 
of mental representations. At the bottom are motor preparations, neural states 
that result in reflexes. These are normally considered sub-mental. They are 
certainly unconscious (lying entirely in the spinal cord and below), but they 
are representational in my sense. At the next level (still pre-cognitive) we find 
perceptions (and perceptual memories) and appetites. These two interact to 
produce motor directions, resulting in behavior. Above these are the four cog- 
nitive categories: opinions, intentions, volitions (both motor and mental), and 
second-order cognitions. There is some connection between perceptions and 
opinions, and between appetites and intentions, and the output of the cognitive 
level is volition, resulting in intentional action. However, the interaction at the 
cognitive level is probably quite complex, involving a good deal of feedback. 


[Figure 14.1: The Network of the Mind. Node labels recoverable from the diagram: Second-order Cognitions, Intentions, Motor Directions, Motor Preparations.]


The account of representationality in each of these categories will be some- 
what different. I will focus here primarily on perceptions and opinions, and 
secondarily on reflexes, appetites, and intentions. I will have little to say about 
the other categories, except to say that I expect the account of representation- 
ality to be parasitic in those cases upon the five that I discuss. Most probably, 
something like Harman’s theory of conceptual-role or functional-role semantics
is appropriate for the elements of higher-order cognitive states. However, the
fact that the roles of these states include connections to first-order states, which 
in turn have information-theoretic and not conceptual-role semantics, suggests 
a way of breaking into the hermeneutic circle of a system of interrelated roles. 

There does seem to be some connection between the representational content 
of higher-order cognitive states and the teleofunction of carrying information 
about certain possibilities. When I hope for some state φ, my mental state
has the function of carrying information about what would be involved in the
possible realization of φ. This seems to be true of fears, doubts, hypotheses,
and many other cognitive states. 


14.2 Pre-cognitive Representations 


In the case of reflexes, a neural state occurs whose function is to produce a 
particular movement of the body. The motor preparation (as I call such a neu- 
ral state) carries some minimal amount of information, such as that something 
dangerous or harmful, or nutritious (as in the case of the nursing reflex), is located
at some relative position to the body. The motor preparation exists because,
in part, states of this kind often carry this information robustly. A state carries
information I robustly just in case it carries that information, and every
super-situation also carries the same information. Robust information is always
veridical: ordinary information is reliably, but not infallibly, veridical. 
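
The definition of robust carriage can be illustrated with a deliberately crude sketch. It assumes, purely for illustration, that a situation is a set of facts, that a super-situation is any situation in a given universe containing those facts, and that carrying information is governed by a defeasible constraint (an indicator fact, a piece of information, and a set of defeating facts). The book's own notion of information is probabilistic, so the `Constraint` class and all names below are hypothetical stand-ins, not the official definitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

Situation = FrozenSet[str]


@dataclass(frozen=True)
class Constraint:
    """Hypothetical defeasible constraint: `indicator` signals `info` unless defeated."""
    indicator: str
    info: str
    defeaters: FrozenSet[str] = frozenset()


def carries(s: Situation, info: str, constraints: Tuple[Constraint, ...]) -> bool:
    """s carries `info` if some undefeated constraint for `info` applies within s."""
    return any(
        c.info == info and c.indicator in s and not (c.defeaters & s)
        for c in constraints
    )


def carries_robustly(
    s: Situation,
    info: str,
    constraints: Tuple[Constraint, ...],
    universe: Iterable[Situation],
) -> bool:
    """s carries `info`, and so does every super-situation of s in the universe."""
    return carries(s, info, constraints) and all(
        carries(t, info, constraints) for t in universe if s <= t
    )


# Toy example: a stick that looks bent from one viewpoint.
CONSTRAINTS = (
    Constraint("looks_bent", "stick_is_bent", frozenset({"looks_straight_from_side"})),
)
S: Situation = frozenset({"looks_bent"})
BENIGN_UNIVERSE = [S, frozenset({"looks_bent", "held_in_air"})]
DEFEATING_UNIVERSE = BENIGN_UNIVERSE + [
    frozenset({"looks_bent", "looks_straight_from_side"})
]
```

In the benign universe the situation S carries the information robustly. Once a super-situation containing a defeating perspective is added, S still carries the information but no longer does so robustly.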

A perception is a state that also carries information (and, possibly, misin- 
formation), and whose function it is to carry this information robustly. Unlike 
motor preparations, a perception does not have a characteristic movement as 
its effect. Instead, it interacts with other perceptions and with the appetites 
to produce motor directions, resulting in goal-directed behavior. Such behav- 
ior may reflect operant conditioning, but not any more sophisticated form of 
learning. 


Definition 14.1 (Representational Content of Perceptions) 

A perceptual type φ has the representational content (R, ψ) just in case there is
a type χ such that φ has the function of robustly carrying the information, under
the condition χ, that there exists a situation-token of type ψ in relation R to the
point of origin.


Typically, the relation R will give some agent-centered location in space and 
time, e.g., ‘left’, ‘right’, ‘up’, ‘down’, ‘now’, or ‘very recently’. 

My account differs in three subtle but important ways from that of Dretske. 
For Dretske, a type φ represents that p if it is the function of φ to carry the
information that p. Thus, for Dretske a representational function is a function
of carrying information, not of carrying information robustly. According to my
definitions of information and function, if a state φ had the function of carrying
the information p, then that state would always carry the information p, and
so malfunctioning in representation would be impossible. The reason for this is
that on my account, type φ carries the information p just in case the probability
of p on some state’s being φ is infinitely close to 1. Moreover, a type that has
a given function fulfills that function in almost all cases (with a probability
infinitely close to 1). Thus, a type φ whose function is to carry information p
will carry that information with probability 1, which means that the probability
of p conditional on φ’s being instantiated is always 1. Consequently, this function
will always be fulfilled. 

In contrast, if a state has the function of carrying information robustly, then 
it will fail to fulfill that function in any case in which it carries the information 
non-veridically, or even when it carries veridical information but does so in a 
context in which it would be overridden by misleading non-veridical information 
(these are cases corresponding to Gettier-like instances of true belief without 
knowledge). A perception can be ‘true’ or veridical, even if it fails in its function 
to carry its information robustly. Gettier cases of veridical misperception illus- 
trate this possibility. If I perceive that the stick is bent, and it really is bent, but 
the stick would, from other perspectives, appear straight, then my perception is
veridical, but not robust. 

When a representation fails to fulfill its function because the information it 
carries is non-veridical, I call this failure a case of “malrepresentation.” Mal- 
representations are relatively infrequent, since the objective relative probability 
of such a failure is always infinitesimal. 

The distinction between carrying information and carrying that information 
robustly is a very fine one. The conditional probability of one on the other will 
always be infinitely close to one. Nonetheless, causation is sensitive to such
extremely fine-grained distinctions, so it is possible for a state to have one but 
not the other as its function. The function of a perceptual state is not only to 
carry certain information, but to do so robustly, since it is the secure possession 
of truth that contributes to the organism’s fitness. 

A second point of difference between my account and Dretske’s lies in my
use of conditional functions. For φ to have the representational content of
ψ, it is not necessary that φ carry the information ψ absolutely. It is sufficient
if φ has the teleofunction of carrying the information ψ on the condition that
some condition χ is also realized. For example, suppose that a rabbit always
runs in the opposite direction of a certain kind of sound. We should say that 
this sound represents the presence of a predator in the relevant direction, even if 
predatory animals are present only occasionally when the sound is heard. The 
sound represents the location of a predator, since it carries information about 
the location of the predator on the condition that a predator is in fact nearby. 
In chapter 9, I developed a theory of such conditional information. 

On this model, a representation of the content ψ is not simply a state with
the function of carrying the information that ψ, but rather a state with the
conditional function of carrying the information that ψ in circumstances χ. In
the case of the flight mechanism, we could say that the associated perceptual 
states have the function of carrying information about the approximate location 
of a predator in circumstances in which a predator is actually present and has 
actually made some significant noise of the appropriate kind. It is the animal’s 
perceptual state plus the teleologically relevant circumstances that carry the
information that a predator has a particular location. When the perceptual
state occurs but the relevant circumstances are not actualized, we can say that 
the state constitutes a misrepresentation. 

This notion of conditional function is closely related to the idea of conditional
constraints developed by Barwise and Perry.¹ We can say that a perceptual state
φ of a token s has the conditional function (conditional on circumstances χ) of
carrying the information (relative to R) that a token of type ψ is actual if
and only if the fact that the conjunction φ & χ carries the information (relative
to R) that ψ is realized is causally relevant to s’s being of type φ.


Definition 14.2 (Conditional Representation) A state φ of situation s has
the function of representing the type ψ relative to R and on condition that χ if
and only if:

(i) the fact that φ and χ are co-instantiated carries the information that ψ
is realized in relation R to the site of information,

(ii) the informational constraint expressed by (i) is itself causally relevant to
s’s being of type φ.


When type φ has the teleofunction of robustly carrying the information that
ψ is realized in relation R on condition that χ is also realized, I call the type
χ a “normality” condition for representation φ. When a representation fails to
be accurate because its normality condition is not realized, I call the failure a 
case of “misrepresentation,” as opposed to “malrepresentation”, which occurs 
when the normality condition is present but the information carried fails to be 
veridical. When the rabbit is startled by a harmless animal, the perception is a 
case of mis-, rather than mal-, representation. Misrepresentations can be quite 
common, since there is no requirement that the associated normality condition 
of a representational type be objectively probable. 
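
The ways in which a perception can succeed or fail in its function, as distinguished so far, can be summarized in a small sketch. The three boolean inputs and the names `Outcome` and `classify` are assumptions introduced for illustration. The text does not say how to label a perception that happens to be veridical when its normality condition is absent, so the sketch classifies veridical cases by robustness alone.

```python
from enum import Enum


class Outcome(Enum):
    FULFILLED = "carries veridical information robustly"
    VERIDICAL_NOT_ROBUST = "veridical, but not robust (Gettier-like)"
    MALREPRESENTATION = "normality condition realized, information non-veridical"
    MISREPRESENTATION = "inaccurate because the normality condition is not realized"


def classify(normality_realized: bool, veridical: bool, robust: bool) -> Outcome:
    """Classify one token of a perceptual type by how it fares against its function."""
    if robust and not veridical:
        # Robust information is always veridical, so this combination is incoherent.
        raise ValueError("robust information is always veridical")
    if not veridical:
        if normality_realized:
            return Outcome.MALREPRESENTATION
        return Outcome.MISREPRESENTATION
    return Outcome.FULFILLED if robust else Outcome.VERIDICAL_NOT_ROBUST
```

For instance, the rabbit startled by a harmless animal corresponds to `classify(False, False, False)`, a misrepresentation, while the bent stick that would look straight from other perspectives corresponds to `classify(True, True, False)`, which is veridical but not robust.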

The third way in which my account differs from Dretske’s is in its explicit 
use of a relation between the token bearing the information and a token which 
the information is about. This relation corresponds to the demonstrative com- 
ponent of meaning as described by J. L. Austin and by Barwise and Perry. This 
relational component undergirds the possibility of de re attitudes and plays a 
crucial role in averting the Liar paradox. 

In the case of an appetite, the information that is carried concerns the needs 
of, or opportunities for some advantage of, the organism. Appetites and aver- 
sions can be thought of as a special category of perception: perceptions that are 
like reflexes, in that they have the additional function of making a particular 
sort of behavior more likely. Unlike motor preparations or motor directions, 
however, appetites do not actually carry the information that the characteris- 
tic sort of behavior will occur, since the frustration and deferral of satisfaction 
of appetites is a regular, and not an exceptional, occurrence. In addition, the 
behavior whose probability is enhanced by an appetite must be described quite
abstractly: behavior that is likely to meet the represented need or achieve the
represented advantage, given the perceptions and past experience of the organism.

¹ (Barwise and Perry, 1983, pp. 112-114, 270-272) and (Barwise, 1989, pp. 149-151).

Motor directions, like motor preparations, carry information about the future 
state of the organism’s body. Misinformation occurs when something interferes 
with the execution of the directions. Once again, it is possible for the motor 
direction to succeed in producing its characteristic behavior, and yet fail to 
fulfill its function, if it fails to carry the information about the future behavior 
robustly, if, for example, its success depends on some accidental but fortunate 
factor. 


14.3 Cognitive States: Opinions and Intentions 


Intentions carry a kind of conditional information about the future state of the 
organism. An intention to bring about a situation of type φ that is R-related to 
the intention is a state whose function it is to carry the information that such 
a situation will be brought about, unless the agent’s opinions are mistaken or 
a new, countervailing intention interposes. Goals can be thought of as second- 
order intentions: intentions to form intentions of a certain kind, if conditions 
for realizing the goal are sufficiently favorable. 

An opinion is a cognitive state with the function of carrying robust informa- 
tion. Knowledge is the normal case: mere opinion is a case of failed knowledge. 
Hence, I do not define knowledge as a special kind of true opinion; rather, it is 
true opinion that must be defined in terms of knowledge. An opinion is a case 
of knowledge just in case it has the function of robustly carrying the information 
(R, ψ), and it succeeds in carrying that information robustly. If it carries 
veridical information, but does not do so robustly, it is a case of true opinion. 
If it carries non-veridical information, it is a case of false opinion. 
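
The three-way classification just given can be pictured as a small decision procedure. The following Python sketch is only an illustration: the names `OpinionStatus` and `classify_opinion` and the two boolean inputs are assumptions of the sketch, and it takes for granted that the questions of veridicality and robustness have already been settled by the account of information.

```python
from enum import Enum


class OpinionStatus(Enum):
    KNOWLEDGE = "knowledge"
    TRUE_OPINION = "true opinion"
    FALSE_OPINION = "false opinion"


def classify_opinion(veridical: bool, robust: bool) -> OpinionStatus:
    """Classify an opinion-token whose function is to carry information robustly.

    Both flags are hypothetical inputs: `veridical` says whether the
    information carried is veridical, `robust` whether it is carried robustly.
    """
    if not veridical:
        # Non-veridical information: a false opinion, however it is carried.
        return OpinionStatus.FALSE_OPINION
    if robust:
        # The function is fulfilled: the normal case, knowledge.
        return OpinionStatus.KNOWLEDGE
    # Veridical but not robust: failed knowledge that happens to be true.
    return OpinionStatus.TRUE_OPINION
```

On this picture true opinion is reached only as the residue of failed knowledge, which mirrors the order of definition in the text.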

Since opinion is a cognitive state, each opinion-token realizes a plurality of 
representation-types. In the case of opinions whose content is atomic, at least 
one of these types must be a token-designating type, something analogous to a 
singular term or to a discourse referent in Kamp/Heim discourse representation 
theory (Kamp and Reyle (1993), Heim (1990)). Another of the types must be a 
type-designating type, analogous to a predicate or DRT condition. The func- 
tion of a token-designating representational type must make reference to the 
existence of type-designating types, and vice versa. The two sets of functions 
are interdependent. 

Each type-designating type has the function of combining with one or more 
token-designating types, where each of the latter must belong to a different cat- 
egory, such as agent-designating, patient-designating, means-designating, etc. 
In this way we can distinguish opinions whose content is ‘Antony loves 
Cleopatra’ from those whose content is ‘Cleopatra loves Antony’. Hence, every 
token-designating type must come in several distinct sub-types, corresponding 
to these different sub-sentential roles. 
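
The work done by these sub-sentential roles can be illustrated with a small data structure. This is a sketch under stated assumptions: the class `AtomicContent`, the helper `atomic_content`, and the role labels `agent` and `patient` are hypothetical names chosen for the example, not part of the theory itself.

```python
from typing import FrozenSet, NamedTuple, Tuple


class AtomicContent(NamedTuple):
    relation: str                          # designated by the type-designating type
    arguments: FrozenSet[Tuple[str, str]]  # (role, referent) pairs


def atomic_content(relation: str, **roles: str) -> AtomicContent:
    """Build a content from a relation and role-tagged referents.

    Keyword arguments guarantee that each role is filled at most once.
    """
    return AtomicContent(relation, frozenset(roles.items()))


antony_loves = atomic_content("loves", agent="Antony", patient="Cleopatra")
cleopatra_loves = atomic_content("loves", agent="Cleopatra", patient="Antony")
same_again = atomic_content("loves", patient="Cleopatra", agent="Antony")

# The roles, not the order of mention, fix which content is which.
print(antony_loves == cleopatra_loves)
print(antony_loves == same_again)
```

Because the arguments are stored as an unordered set of role-tagged pairs, the two opinions come apart only through the roles their token-designating types occupy.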




A token-designating type φ will typically bear as its content some relation 
R_φ. The relation R_φ picks out those tokens that are potential referents for 
tokens of type φ: if s is of type φ, and R_φ(s, s’), then s’ is a potential referent for 
s. If φ is a true proper name, R_φ will always pick out a unique object. 

In some cases, a type-designating type will bear as its content not a single, 
unitary type, but a more or less loosely defined family of types. In these cases, 
certain members of the family play the role of paradigms of the family, and 
membership in the family of types is determined by phenomenal or cognitive 
“distance” between each type and each paradigmatic member of the family. In 
cases in which the space of types is dense (in the mathematical sense), for ex- 
ample, when the space of types constitutes a continuum, this process of defining 
families gives rise to the phenomenon of vagueness. In a 1994 article in Mind 
(Koons (1994c)), I argued that vagueness can best be modeled by means of a 
four-valued logic, and that accepting the existence of first-order vagueness does 
not force us to postulate an unending regress of higher-order vaguenesses. 


14.4 Mental Representation and Language 


Although I will have little to say in this work about linguistic meaning, I will go 
so far as to endorse (for the most part) the account of linguistic meaning given 
by Millikan in Language, Thought and Other Biological Categories (Millikan 
(1984)). Unlike Grice, I would not attempt to explain the content of the elements 
of public language by reference to a complicated set of beliefs and intentions on 
the part of the speakers of that language. In many cases, we use language more 
or less thoughtlessly. In fact, the metaphor of words’ using us to perpetuate 
themselves (like Dawkins’s “selfish genes”) is an apt one. Words and syntactic 
structures have the function of carrying certain kinds of information because 
their doing so in the past is part of the causal explanation of our present use of 
them. 

At the same time, I am equally adamant in rejecting the view of those 
(like Sellars, Davidson, or Brandom) who would make mental representation 
dependent on the social practices constituting a public language. Normativity 
is generated by teleology, not by sociology. Indeed, if there were no normativity 
built into the teleofunctional structure of human life, the positive norms of 
social custom would themselves be impossible. The social practices of assertion 
and other forms of communication are only fruitful because we have something, 
namely information, that is worth communicating. 

No doubt participation in a public language increases the flexibility and 
subtlety of our repertoire of mental representations. The linguistic division of 
labor emphasized by Tyler Burge and Hilary Putnam has much to do with 
explaining this fact. Nonetheless, were there no pre-linguistic content, this 
multiplier effect would be nugatory. 


14.5 The Narrowness of Mental Content 


Is mental content broad or narrow? That is, does mental content supervene 
on the intrinsic condition of the human being? On my account, this turns on 
whether or not teleofunctional states supervene on this same intrinsic condition. 
I am inclined to say that it does — that the function of a state is determined by 
the most likely causal history of the organism in which the state occurs, given 
the intrinsic state of that organism. (See chapter 14.) This would mean that 
all mental content is narrow. 

However, some of our mental contents are linguistically realized. The func- 
tion of a word or other linguistic unit does not supervene on the intrinsic con- 
dition of the word or symbol, nor does it depend only on the intrinsic condition 
of the individual language-user. We must instead look at the linguistic system 
as a whole, as a communal institution. I would argue that the representational 
contents of words supervene on the present condition of the entire language 
community. However, this means that the contents of linguistically-mediated 
mental states are broad rather than narrow. 


14.6 Teleosemantics and the Liar Paradox 


Since I have given a kind of definition of ‘true opinion’, there is the danger that 
I will run afoul of Tarski’s theorem to the effect that truth is indefinable. In 
particular, it would seem to be possible to construct a Liar opinion-token, true 
just in case it is false. 

However, it is essential to my account that every opinion-token realize two 
kinds of types: type-designating and token-designating. Even opinions whose 
content seems to be abstract and eternal, like ‘2 + 2 = 4’, are always opinions 
about some concrete token or other. In the case of mathematical opinions, the 
token in question is an atemporal one, supporting certain kinds of modal con- 
straints. In the case of semantic opinions, the situation-token will be a complex 
one, supporting certain facts about some opinion-token, and also certain modal 
and causal facts of a general, atemporal nature. 

To say that some opinion-token p is true is always to say, of some situation- 
token s, that it is of the making-p-true type. It is the addition of this token 
parameter that enables me to escape the Liar paradox, by simply borrowing 
the work done on such parametric theories of the Liar as those of Barwise and 
Etchemendy (1987) and myself (Koons (1992), Koons (1994a)). If we try to 
formulate a Liar token, an opinion p that carries the content that some situation 
s is of the not-making-p-true type, we reach the conclusion that p is actually 
true, since s is not of that type. However, there is some larger situation s’ that 
does make p true. When we reflect upon p, we are talking about this larger 
situation s’. Hence, there is no contradiction: s is of the not-making-p-true 
type, and s’ is of the contrary type. (If we formulate the Liar in terms of falsity, 
instead of non-truth, we reach the conclusion that the Liar is made false in the 
larger, but not in the smaller, situation.) 


15 


A Causal Theory of Logical and Mathematical Cognition 


15.1 The Need for a Causal Theory 


Gettier examples in epistemology indicate that a causal element is needed to 
distinguish knowledge from true opinion. The distinction between knowledge 
and true opinion pertains to the domain of logic and mathematics, as well as 
to domains of contingent and temporal truths. Hence, we need to be able to 
appeal to the existence of causal connections of the appropriate kind between 
mathematical facts and our beliefs about them. Paul Benacerraf (1983a) has 
pressed this point as a basis for a critique of the realist conception of mathematical 
truth: if mathematical facts are causally inert, we cannot know them. 

Additionally, a causal theory is needed to provide an account of how we 
are able to refer to particular mathematical objects. There are infinitely many 
mathematical structures that provide models of our theories of arithmetic: how, 
apart from a causal connection, are we to explain the fact that we refer to exactly 
one of these structures in arithmetic? This point, like the last one, has roots 
in the work of Benacerraf, in particular his “What Numbers Could Not Be” 
(Benacerraf (1983b)). 

As soon as we try to do so, however, we face a dilemma. If we try to 
identify logical and mathematical facts with contingent, spatiotemporal facts, 
we distort the nature of mathematics and lose that which distinguishes it from 
other sciences, as is illustrated by John Stuart Mill’s version of mathematical 
empiricism. Alternatively, if we locate mathematical fact in a timeless, necessary 
Platonic heaven, we face the daunting task of finding a ladder to make possible 
commerce between the Platonic heaven and cognitive states on earth. Merely 
talking about “a priori knowledge,” or alluding vaguely to a special capacity 
of sight or touch (seeing that 2 + 2 = 4, or grasping a mathematical truth), 
fails to give us the kind of substantive theory capable of sustaining the 
knowledge/opinion distinction. 

I will attack one horn of the dilemma, the horn that has rarely been chal- 
lenged, to my knowledge (the sole exception I know of is Penelope Maddy’s 
work, about which I will have some comments in section 15.7 below). I will 
argue that numbers and other mathematical objects are real and are causally 
effective. Information about mathematical objects is conveyed to us causally, 
not by some mysterious faculty of “mathematical intuition,” but through our 
interactions (both sensory and active) with ordinary, everyday situations. My 
view might be described as a kind of naturalistic Platonism, as opposed to the 
mystical Platonism of those who postulate a mysterious, non-natural channel 
to the mathematical as the unique possession of the human mind. 


15.2 Logico-Modal Facts as Causes 


I suggest that it is modal facts that provide Jacob’s ladder between temporal 
events and Platonic truths. In chapters 4 through 7, I developed an account 
of causation according to which modal facts can act as causes of contingent, 
temporal events. Logical and mathematical facts have causal efficacy in their 
modalized forms. For example, consider the law of excluded middle. If a token 
is of type φ ∨ ¬φ, then it must be either of type φ or of type ¬φ, since the 
relation between tokens and types is governed by the extension of the strong 
Kleene truth tables to four-valued logic (the Dunn-Belnap tables). The purely 
disjunctive type never figures as such in any causal chain, as I explained in 
section 4.8.1. An instance of the law of excluded middle is a paradigm case of 
an unnatural or gerrymandered type. However, a situation token can be of the 
type □(φ ∨ ¬φ), without being of type φ or ¬φ. This modalized disjunction can 
figure significantly in causal chains. 
Suppose that we have causal laws of the following forms: 


((φ ∧ χ) |~ ψ) 
((¬φ ∧ ρ) |~ ψ) 


Let us suppose that situation s supports both of these laws, plus the types χ 
and ρ. Is s then a total cause of a succeeding state s’ of type ψ? Not necessarily. 
In addition to the two causal laws above, s must also support the modal type 
□(φ ∨ ¬φ). Without this modal fact, the existence of a situation of type ψ 
does not follow (see section 8.3). For example, there could be an impossible 
situation-token s’ that, from the limited perspective of s, is counted as possible. 
This token s’ might support both φ and ¬φ. Consequently, it falsifies both of 
these types. The two causal laws above could still hold at s, even if s’ is not 
followed by a token of type ψ, since s’ falsifies the antecedents of both laws. In 
contrast, if s did support the modalized instance of the law of excluded middle, 
this would be impossible, and every s-possible token would be followed by a 
token of type ψ. 

If we assume that causal connections between tokens are always associated 
with causal connections between the corresponding types (what I called “Hume’s 
hypothesis” in chapter 5), this would mean that token s cannot cause a token s’ 
of type ψ unless it supports the modal/logical type □(φ ∨ ¬φ). Hence, we can 
correctly say that the modalized version of this instance of the Law of Excluded 
Middle was causally relevant to the production of the concrete, temporally 
located situation s’ of type ψ. 

The existence of this vertical causal connection does not exclude the simulta- 
neous existence of an ordinary, horizontal connection. To return to the example 
above, there will be in the world a situation s’ that is either of type φ or of 
type ¬φ. If this situation s’ also supports the relevant causal law, then it will 
constitute a total cause of the existence of a situation of type ψ. This will not, 
however, be a case of overdetermination, since the two causal connections occur 
at different levels. The vertical causal connection involving the modal property 
□(φ ∨ ¬φ) presupposes the existence of some horizontal connection involving 
either φ or ¬φ. 

My account of logical knowledge depends on two things: postulating the 
existence of logically complex situation-types (negative types, disjunctive types, 
etc.), and postulating that the support relation between tokens and complex 
types is governed by a four-valued interpretation scheme, namely, the Dunn 
tables (and their extensions to the first-order case). 

In some actual situations, the facts are partial: if φ represents the presence of 
hydrogen, then neither φ nor ¬φ, its negation, may be supported in certain parts 
of the world (parts representing features other than chemical ones, for instance). 
There are no actual, nor even any possible, situation-tokens supporting both φ 
and ¬φ; however, this impossibility is itself a fact that may be supported in 
some situations and not in others. 

In order to represent such modal partiality, it is convenient to use impossible, 
overdetermined situation-tokens in our models. I am a realist about modality, 
but (unlike David Lewis) not about possible-but-not-actual situations. What 
is really possible is the realization of a certain type: it is convenient to model 
this fact through the use of fictional objects such as possible-but-not-actual 
situation-tokens. Similarly, in order to model modal partiality, it is convenient 
to make use of the fiction of impossible situation-tokens. 

One major advantage to the Dunn tables (as well as to their three-valued 
counterpart, the strong Kleene tables), for my purposes, is that no formula is 
true in every interpretation, or, in my case, no type is supported by every token 
in every model. There are non-trivial logical consequences in partial logic; for 
example, φ & ψ entails ψ & φ. However, there are no logical validities in this 
logic, no conclusions that can be validly drawn from an empty set of premises. 
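
These two features of the Dunn tables can be checked mechanically. The following Python sketch is an illustration only: the encoding of the four values as (supported, rejected) pairs, the tuple representation of formulas, and the function names are all assumptions of the sketch rather than notation drawn from the text.

```python
from itertools import product

# Each value is a pair (supported, rejected).
TRUE, FALSE = (True, False), (False, True)
NEITHER, BOTH = (False, False), (True, True)
FOUR_VALUES = (TRUE, FALSE, NEITHER, BOTH)


def evaluate(formula, valuation):
    """Evaluate a formula given as an atom name or a nested tuple."""
    if isinstance(formula, str):
        return valuation[formula]
    op = formula[0]
    if op == "not":
        sup, rej = evaluate(formula[1], valuation)
        return (rej, sup)  # negation swaps support and rejection
    left = evaluate(formula[1], valuation)
    right = evaluate(formula[2], valuation)
    if op == "and":
        return (left[0] and right[0], left[1] or right[1])
    if op == "or":
        return (left[0] or right[0], left[1] and right[1])
    raise ValueError(f"unknown connective: {op}")


def valuations(atoms, values=FOUR_VALUES):
    """Yield every assignment of the given values to the atoms."""
    for combo in product(values, repeat=len(atoms)):
        yield dict(zip(atoms, combo))


def entails(premise, conclusion, atoms):
    """True if every four-valued valuation supporting the premise supports the conclusion."""
    return all(evaluate(conclusion, v)[0]
               for v in valuations(atoms)
               if evaluate(premise, v)[0])


def is_valid(formula, atoms, values=FOUR_VALUES):
    """True if the formula is supported under every valuation over `values`."""
    return all(evaluate(formula, v)[0] for v in valuations(atoms, values))


excluded_middle = ("or", "p", ("not", "p"))

print(entails(("and", "p", "q"), ("and", "q", "p"), ["p", "q"]))
print(is_valid(excluded_middle, ["p"]))
print(is_valid(excluded_middle, ["p"], values=(TRUE, FALSE)))
```

A valuation that leaves every atom undecided supports no formula at all, which is why nothing passes the four-valued validity test, whereas excluded middle does pass once the valuations are restricted to the two classical values.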

This means that every classical validity (every type that is supported by 
every totally defined token) corresponds to a piece of modal information that 
may or may not be supported by a given token. If we slap a □ in front of a 
classically valid type φ, we produce a piece of logical information that can enter 
into causal explanations of concrete events, including our own perceptions and 
beliefs. 

1. See appendix A. 


15.3 Knowing How to Infer Correctly 


It is important, for my purposes, to distinguish between logical knowing-how 
(knowing when it is proper to draw a particular inference) and logical knowing- 
that (knowing the necessity of a given classical validity). Knowing-how is to 
be defined in terms of a reliable disposition to draw only the correct (strong- 
Kleene or Dunn valid) inferences, where this disposition has logical reliability 
as its proper function (in the sense given in chapter 12). Knowing-that involves 
knowledge of particular modal facts, which entails the existence of an appropri- 
ate causal link between the particular fact known and the knowing of it. 

A pattern of inference is knowledge-conferring for a given person only if it 
has, as realized in the dispositions of that person’s mind, the teleofunction of 
preserving robust (or, at least, veridical) information. A necessary condition of 
such teleofunctionality is that the cognitive disposition be of such a kind as is 
typically caused by a corresponding constraint in the world. 

We can see the need for such a causal connection by considering Gettier-like 
examples of failed inferential knowledge. Suppose that Max has proved a math- 
ematical theorem by means of the inference rule of modus ponens. However, 
Max used modus ponens only because the rule was recommended to him by his 
astrologer, Morris, and Morris recommended that Max use modus ponens only 
because Max is a Pisces. Had Max been born under any other sign of the zodiac, 
Morris would have recommended, and Max would have used, other, unsound 
rules of inference. Under such circumstances, Max’s use of modus ponens is 
not knowledge-conferring, and so Max’s would-be proof provides him with no 
knowledge of the truth of the theorem. 

For inferential knowing-how to be possible, we must have two constraints, 
one a modal constraint involving logical or mathematical form, and the other 
a causal constraint linking different beliefs in the mind of the mathematician 
on the basis of their content.² Take, for instance, the inference pattern 
corresponding to the T axiom of modal logic: inferring φ from □φ. This inference 
pattern is realized in the mind of the mathematician as a causal constraint 
between belief-types: Bel(□φ) |~ Bel(φ). This inference pattern is an instance 
of knowing how to reason correctly only if it has the teleofunction of mirroring 
a corresponding constraint in the world, namely, □(□φ → φ). This modal 
constraint is supported by any situation-token sufficiently rich in its support 
of modal facts. For the causal constraint in the mathematician’s mind to have 
the appropriate teleofunction, it must be of such a kind as to be apt to be caused 
by the corresponding modal constraint. Fortunately, the account of vertical or 
higher-order causation developed in part I is adequate to the task of postulating 
such higher-order causal constraints between modal constraints on the one hand 
and cognitive causal constraints on the other. 

2. See Barwise and Seligman’s recent book (Barwise and Seligman (1997)) on information 
flow for a powerful mathematical theory of such constraint-mirroring. 

In this case, there are general facts about natural selection, or general facts 
about the human capacity for trial-and-error learning, that can provide the basis 
for such a higher-order causal constraint. Inference patterns that are reliably 
truth-preserving are the sort of thing nature selects for, and they are also the 
sort of thing that humans are apt to discover on the basis of experience. In 
some cases, our disposition to reason correctly in a certain way is innate, to 
be explained by natural selection, and in other cases, it is learned from the 
individual human being’s experience. Which inference patterns are which is a 
question to be settled by empirical cognitive psychology. 

In logic, mathematics and other formal disciplines, knowing-that can be seen 
as merely a special case of knowing-how. To know a logical truth or mathemat- 
ical axiom, all one needs is to know how to infer that truth from the empty set 
of assumptions. Where the set of assumptions is empty, the modal constraint 
collapses into the simple necessity of the corresponding logical or mathematical 
type, and the cognitive causal constraint is simply the disposition to believe the 
axiom without proof. 


15.4 Is Logic Factual? 


I am claiming that the subject matter of logic is a domain of fact, specifically, 
of modal fact. There is a long tradition in philosophy, beginning at least with 
Hume, that divides truths into two categories: matters of fact and matters of 
the relation of ideas. Logic is the paradigmatic example of the second category. 

Hume’s distinction depends on the assumption that there can be no necessity 
in the world other than that which is projected on the world by some sort of 
psychological necessity. This in turn was based on Hume’s sensationalist theory 
of concepts: since we have no sensation of necessity (outside of introspection), 
we can have no real concept of it (as applying to external realities). 

It is hard to see how Hume’s distinction can be defended, since if there really 
is no necessity in the world, then there is no psychological necessity either, and 
hence no necessary relation of ideas. Conversely, once we acknowledge that some 
mental representations are possible and others are impossible, what principled 
reason do we have for refusing to extend this distinction to extra-mental event-types? 

Another objection to a factual theory of logical truth proceeds in this way: 
whatever is factual is contingent, logical truths are not contingent, and therefore 
logic is not factual. I deny the first step: many modal facts (perhaps all of them, 
if the S5 axioms are sound) are necessary. 


15.4.1 The Form/Content Distinction 


A Kantian objection to the factuality of logic would be based upon the form/content 
distinction. Logical truth has to do only with the form, and not with the con- 
tent, of the relevant propositions. Factual truth, in contrast, depends on the 
content as well as the form. Some logical forms, like that of a self-contradiction, 
are incoherent, and so do not represent any possible state of affairs. Negations 
of such incoherent forms are true by virtue of form alone, and hence convey no 
information about the actual world. 

I have no problem with the hypothesis that there is such a thing as logical 
form. My claim is simply that the impossibility of the actualization of an 
inconsistent form is itself a matter of fact, by which I mean, the sort of thing that 
can figure in causal explanations. Forms are themselves parts of the world, even 
if (as the Kantian might suppose) only as parts of the mind. The impossibility 
of assembling a representation with a certain logical form, or the impossibility 
of a representation with such a form applying veridically to the world, is itself a 
modal fact about the world (as a whole). Hence, the Kantian has not given an 
account of logical necessity that is innocent of commitment to modal facts. The 
Kantian makes claims about the necessary coherence or incoherence of certain 
forms, and these claims themselves are necessarily true by virtue of their content: 
they make claims about logical form, but do not themselves have the form of 
logical tautologies. 


15.4.2 A Critique of Conventionalism 


Are the principles of logic true by convention, or by virtue of the meanings of 
the words involved, and not by any kind of correspondence to the world? I find 
Quine’s objections to these theses in his classic, “Truth by Convention” (Quine 
(1949)), to be decisive. 

If we say that the principles of logic are merely rules that we adopt, we 
face two embarrassing questions: (1) for what purpose do we adopt these rules? 
and (2) what determines what follows from the rules we adopt? On the first 
question, surely we use the rules of logic we do because we believe that they are 
reliably truth-preserving, in both absolute and hypothetical contexts. Modal 
facts about truth and validity, then, stand prior to and apart from our choices of 
logical systems. On the second question, it is surely a matter of logic to decide 
what does and does not follow from any set of rules. Hence, logic itself cannot 
be merely a set of rules. 

If we suppose that the logical conventions we adopt are not themselves logically 
complex propositions (e.g., ‘every sentence of the form (p → p) is true’), 
but instead consist merely in a pattern of linguistic behavior, we face the prob- 
lem that our past behavior is finitary, but the set of logical truths and logical 
consequences of our language is infinite. It is impossible for a finite set of de- 
cisions (‘this shall be true’, ‘that shall be false’) to determine the extension of 
logical truth, unless negation, conjunction, quantification, and the other logical 
primitives have a pre-linguistic existence. If (and only if) the latter is so, our 
conventions can link particular English words and symbols to these pre-existing 
logical operations. But, in this case, the laws of logic are not themselves the 
product of our conventions but have an independent existence. 

If we suppose that the truths of logic are given by a conventional acceptance 
of the classical truth tables, we face the problem that the truth tables presuppose 
the principle of bivalence: the general fact that every proposition is either true 
or false, but not both. This general fact of bivalence is not something we can 
make true by collective fiat. If we say that bivalence is entailed by what we 
mean by “proposition,” then we have no basis for assuming that the class of 
propositions (so specified) is really closed under the operations of negation, 
conjunction, and so on. Every time we encountered a novel sentence, we would 
have to first verify somehow that it really expressed a proposition (according 
to our bivalent conception) before we could confidently apply the principles of 
logic to it. 


Transcendent-Basis and Immanent-Basis Conventionalisms 


It is clear that in some sense all of the truths of logic involve conventions, 
as do all the true sentences of any actual language. However, there are two 
ways to understand the role of convention. On the first, the transcendent-basis 
conception (hereafter, ‘TB conventionalism’), conventions link the elements of 
our language to pre-existing forms or structures. Only finitely many links are 
needed, each established via the combination of teleology and information (as 
I described in chapter 14). There are infinitely many truths of logic expressible 
in English because there are infinitely many logical facts comprising the finitely 
many logical elements associated with the logical constants of English. The 
truth-makers of the theorems of logic exist independently of our language and 
its conventions: conventions serve only to link sentences to appropriate truth- 
makers. 

On the opposing view, the immanent-basis conception (hereafter, ‘IB con- 
ventionalism’), there are no such convention-independent facts or truth-makers. 
Our conventions in and of themselves ground the truths of all “analytic” sen- 
tences, including all of the theorems of logic. These truths are somehow imma- 
nent in our practices, taken as contingent facts of social practice, unrelated to 
a realm of Platonic objects and their relations. 

As I have argued above, it is impossible for finitary social practices, by them- 
selves, to ground an infinity of logical and mathematical truths. To make this 
clearer, I will consider in more detail two versions of the immanent construc- 
tionist conception, one drawing on the later work of Wittgenstein, and the other 
on Carnap’s philosophy. 


Quasi-Wittgensteinian Conventionalism 


It is unclear whether Wittgenstein, in the Philosophical Investigations (Wittgen- 
stein (1953)) or Remarks on the Foundations of Mathematics (Wittgenstein 
(1978)), embraces the IB conventionalist view. His primary target in these 
works seems to be the Cartesian epistemology of Bertrand Russell, especially 
his emphasis on “knowledge by acquaintance.” In fact, I will argue that IB con- 
ventionalism is inconsistent with certain key Wittgensteinian tenets. Nonethe- 
less, it is possible to use some of Wittgenstein’s ideas to fill out the immanent 
constructionist view, and so I will consider such a quasi-Wittgensteinian view 
in this section. 

A quasi-Wittgensteinian could try to turn aside my objections to immanent 
constructionism by denying that there are in fact an infinity of logical truths. 
He could argue that the infinity is merely potential and so deny that there is 
any infinitary fact in need of explanation. I will argue in this section that the 
quasi-Wittgensteinian conventionalist is implicitly committed to the existence 
of an actual infinity of rules. 

For the quasi-Wittgensteinian, logical and mathematical truths are norma- 
tive, rather than descriptive. They do not say anything; instead, they merely 
show or express the rules or norms inherent in our language-game. Thus, the 
quasi-Wittgensteinian is committed to the thesis that actual human practices 
embody certain norms: that particular practices follow certain rules. 

What sort of thing is a rule, as an element of Wittgenstein’s philosophy? We 
can take a first step toward answering this question by focusing on what I shall 
call the infinitary upshot of a rule. With each rule we can associate an upshot, 
a function that assigns certain values (such as ‘correct’, ‘incorrect’, ‘borderline’, 
and so on) to an infinite set of actions in context (past, present, and future, 
actual and hypothetical). 

The IB conventionalist must simply identify a rule with its corresponding 
upshot. If we were to acknowledge that a rule is something over and above this 
upshot, a thing that somehow by itself determines its upshot, we would have 
moved from the IB to the TB version of conventionalism, since the rule would 
then be a transcendent object to which we become connected by engaging in a 
certain social practice. 

The fact that a particular rule (i.e., a particular infinitary upshot) is in
practice in a particular community (concretely specified) is a fact of a kind that 
I shall call a practice fact. 

If the quasi-Wittgensteinian is to have any advantage over the TB conventionalist,
it is essential that these practice facts be epistemically accessible: they
must be the sort of thing that we can come to know. Wittgenstein assumes that 
practice facts, like psychological facts, are not something we perceive directly. 
Instead, there are empirical criteria associated with each such possible fact. 
When we observe positive criteria for the practice fact, we have good (albeit 
defeasible) grounds for accepting the corresponding proposition (namely, the 
proposition that rule R is in practice in community C), and we do not need to 
ground this appeal to these criteria on anything more fundamental (such as the 
observation of a positive correlation between the satisfaction of the criteria and 
the truth of the associated proposition). 

A criterial system is a pair of sets: a set of positive criteria and a set of
negative criteria. Each criterion in turn is a finite set of observable, effectively 
decidable conditions. A criterion cannot itself be infinite, or otherwise inaccessible
to us, since the whole point of postulating criteria is to give us something
we can use to arrive at reasonable opinions concerning the corresponding fact. 

Each possible practice fact must somehow be linked to a criterial system. 
The crucial question is: how are we to account for these linkages? What do 
they consist in? 

The quasi-Wittgensteinian must insist that the linkage of criterial systems
to practice facts is itself a normative, rather than a descriptive, matter. It 
is the rules of our language-game, and not any language-independent fact of 
the matter, that constitute these linkages. This means that the linkage or 
association of criterial systems with practice facts is grounded in a set of actions 
that we collectively take, guided by some shared rule. These actions include such 
things as affirming a practice proposition on the basis of a positive criterion, 
rejecting such a proposition on the basis of a negative criterion, appealing to 
criteria in settling questions about practice facts, etc. An action of associating 
a criterial system with a practice fact concerning some domain of action is a 
kind of meta-action. More formally: 


Definition 15.1 (The Meta Relation) If action a belongs to the domain of 
the upshot U of rule R, and action b links a practice fact concerning R with a 
criterial system C, then b stands in the meta relation to a. 


This quasi-Wittgensteinian version of IB conventionalism is incoherent, be- 
cause the following six propositions are inconsistent: 


1. All practice facts (for example, the fact that rule R is practiced in com- 
munity a) are empirically accessible. 


2. A practice fact can be empirically accessible only if there exists a linkage 
of the fact to some criterial system. 


3. Such linkages are entirely the product of rule-governed human actions. 


4. The transitive closure of the meta relation between actions is a strict partial
order (transitive and irreflexive).


5. There are at most finitely many rule-governed actions. 


6. There exist rule-governed actions (i.e., some practice fact is actual). 


Propositions 1 through 3 guarantee that, if there are any rule-governed prac- 
tices at all, there must exist rule-governed meta-actions that link the practice 
fact to a criterial system. The existence of these rule-governed meta-actions 
constitutes a further practice fact, the fact that the rules governing these meta- 
actions are in fact in practice in the community in question. 

This meta-level practice fact must, in turn, be linked by rule-governed human
actions to a meta-level criterial system. 

According to proposition 4, the meta relation constitutes a partial order, and 
so there must be an infinite hierarchy of distinct practice facts, with an actual 
infinity of linking actions. Were the hierarchy to terminate after finitely many
stages, the highest-level meta-practice fact would be actual but epistemically 
inaccessible, contrary to proposition 1. Moreover, proposition 5 states that 
actual human practices cannot encompass such an infinity of actual actions. 
Consequently, the Wittgensteinian must give up proposition 3, and with it, the 
IB conventionalist conception. 
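
The combinatorial core of this argument can be checked by brute force for small cases. The following is only a sketch under stated assumptions: actions are modelled as the integers 0 to n-1, a pair (b, a) is read as "b stands in the meta relation to a", and the function names are hypothetical labels introduced for illustration. Propositions 1 through 3 are represented by the requirement that every action has some meta-action, and proposition 4 by the irreflexivity of the transitive closure.

```python
from itertools import product

def transitive_closure(edges):
    """Return the transitive closure of a binary relation given as a set of pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def is_irreflexive(relation):
    """Proposition 4 (in part): no action is, even indirectly, meta to itself."""
    return all(a != b for (a, b) in relation)

def every_action_has_meta_action(edges, n):
    """Propositions 1-3: each action a has some b with (b, a) in the meta relation."""
    return all(any(target == a for (_, target) in edges) for a in range(n))

def consistent_models(n):
    """Enumerate every meta relation on n actions that satisfies both constraints."""
    pairs = [(b, a) for b in range(n) for a in range(n)]
    found = []
    for mask in product([False, True], repeat=len(pairs)):
        edges = {p for p, keep in zip(pairs, mask) if keep}
        if every_action_has_meta_action(edges, n) and is_irreflexive(transitive_closure(edges)):
            found.append(edges)
    return found

if __name__ == "__main__":
    for n in range(4):
        print(n, len(consistent_models(n)))
```

For n = 0 the empty relation trivially satisfies both constraints, which corresponds to denying proposition 6. For n = 1, 2, 3 the search finds no relation at all. The general reason is that in a finite non-empty domain a chain of meta-actions must eventually revisit some action.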

If the Wittgensteinian tried instead to give up either proposition 1 or 2, 
he would be faced with a dilemma. Either he would have to postulate some 
mysterious form of non-empirical intuition by which we come to have grounds 
for accepting practice propositions, or he would have to admit that there are 
truths, expressible in our language, that are in principle inaccessible to us. 
This would mean that the Wittgensteinian was no better off than the most 
dogmatic Platonist. We might as well accept that we have a mysterious faculty 
for intuiting logical and mathematical truths, or that we have a mysterious 
ability to think about things that are utterly inaccessible to us. 

Proposition 4 is perhaps the most doubtful of the six. Couldn’t there be
a fixed point of the meta relation, an action a that linked a criterial system 
C to some practice fact concerning a rule R, where a itself belongs to the 
domain of the upshot of R? Let’s call an action embodying such a fixed point 
a self-determining action. I will contend in the following section that a self- 
determining action is clearly impossible, given the IB conventionalist hypothesis. 


Reference and Self-Reference without Transcendent Objects 


At this point, we must return again to the question of the ontological sta- 
tus of rules. If rules are practice-independent objects that autonomously fix 
their own infinitary upshots, then a self-determining action a might be possi- 
ble, since a could make reference to a rule R, and R in turn might determine 
an upshot whose domain included a. However, admitting that rules exist and 
determine their upshots independently of our social practices is to abandon the 
immanent-constructionist hypothesis. Rules so conceived are transcendent, Pla- 
tonic objects which are associated, by convention, with certain kinds of human 
action. 

Alternatively, if rules are not transcendent objects that determine their up- 
shots independently of human action, then we must simply identify a rule with 
its upshot. On this conception, rules do not determine but simply are their 
upshots. However, if rules and their upshots are identical, then self-determining 
actions are clearly impossible. A meta action a that associates a practice fact 
involving rule R with a criterial system C is an action that operates upon a
certain rule (R), taken as a completed entity. If R is simply identical to its upshot
U, and if the meta action a is included in the domain of U, then a cannot
have reference to the rule R without vicious circularity. In such a case, action
a cannot associate a practice fact involving R with a criterial system C, since
the status of a is itself an integral part of R. 

In order to make this point clearer, I must back up a little and discuss how 
the IB conventionalist accounts for the relation of reference for general terms. 




Reference is a relation between a physical state (a vocalization or a graphical 
production or a neurological state) and a set of objects (the extension of the 
general term). The physical state I will call the bare proto-signifier, to contrast 
it with the meaning-encumbered act (about which more is to follow). For the 
sake of argument, I am willing at this point to grant to the IB conventionalist 
the right to treat this reference relation as a brute, irreducible fact, brought 
about in some way or other by our shared social practices. 

In contrast, the TB conventionalist or Platonist will understand the reference 
relation to involve an intermediary object, the transcendent type or universal, 
which in turn determines an extension independently of our practices. On the 
TB account, our practices establish a connection (in my view, a causal connec- 
tion) between the bare proto-signifier and this transcendent object. 

The bare proto-signifier by itself has no meaning. It is only the combination 
of the proto-signifier with the reference-induced extension that can be thought 
of as meaningful. Thus, for the IB conventionalist, the meaning-encumbered act 
can be identified with the sum of the bare proto-signifier and the set of objects in its
extension. The bare proto-signifier itself can be a member of this extension:
there is no reason to insist that the reference relation be irreflexive. However, 
the meaning-encumbered act cannot belong to its own extension (on pain of 
vicious circularity), since it is in part constituted by that very extension. 

The meta relation as I have defined it above is a relation between meaning- 
encumbered acts. Only a meaning-encumbered act can associate a criterial 
system with a practice fact. Hence, for the IB conventionalist, the meta relation 
must be a strict partial ordering. 

Perhaps a legal analogy will be helpful here. Suppose we had a law that 
stated, ‘This act shall be legally binding when it is passed by the legislatures of 
three-fourths of the states’. This law, call it L, attempts to incorporate within 
itself a rule of recognition (in H. L. A. Hart’s sense) that is to apply to itself.³

A document is not legally binding (it is not a law-encumbered document) 
unless it meets the conditions of some binding rule of recognition. Hence, no 
document can provide a legally binding rule of recognition that provides the 
basis for its own legality. There must be a document or an unwritten rule with 
a prior basis of legality to supply the conditions of recognition for L. The law 
cannot be socially constructed or positive rules all the way down. At some point, 
we must arrive at rules of natural law, which provide a practice-independent 
basis for recognizing certain positive rules as binding. The legal positivist and 
the IB conventionalist face precisely analogous infinite regresses.⁴


The Quasi-Wittgensteinian and Infinity 


The quasi-Wittgensteinian is therefore faced with an actual, and not merely 
a potential, infinity of rules and practices. For a practice fact to be actual, 
there must be a further actual practice (at a higher level in the meta hierarchy)
assigning criteria to the original practice fact, or else it would be in principle
unknowable.


³ Compare Article VII of the Constitution of the United States.


⁴ See chapter 22 for further discussion of the incoherency of legal positivism.

It is arguable that Wittgenstein himself would have rejected proposition 3, 
and with it, the IB conventionalist programme. In the Investigations, Wittgen- 
stein postulates that it is our shared form of life as human beings that provides 
the linkages between practice facts and empirical criteria. This notion of a form 
of life is a notoriously obscure one, and one could take the TB conventionalist 
account developed in this chapter as a fleshing out of Wittgenstein’s proposal. 

The TB conventionalist avoids the infinite regress that plagues the IB con- 
ventionalist, since practice facts are, from the TB perspective, ordinary, natural 
facts. There is no need for human action to make these facts epistemically 
accessible through the conventional association to them of empirical criteria. 
Instead, the connection between actual practice and logical facts is sustained by 
causal connections (including the teleology of mental mechanisms), which can 
be investigated by the normal methods of natural science. 


Kripke’s Quasi-Wittgensteinian Solution 


In his 1982 work on Wittgenstein and rule-following, Saul Kripke (1982) in- 
terprets Wittgenstein as offering a “sceptical solution” to the problem of prac- 
tice facts. Kripke’s Wittgenstein hypothesizes that there are no such facts, 
that practice propositions have no truth-conditions. Instead, they have only 
assertibility-conditions. In a recent book, Robert Brandom (1994) suggests a 
similar strategy. Our social practices generate rules for attributing practice facts 
to portions of reality, given sets of empirical data. 

However, the Kripke-Brandom solution only pushes the problem back a step. 
Kripke and Brandom concede that there is no IB conventionalist solution to 
the problem of constructing the binary relation between bare proto-signifiers 
(observable behavior) and infinitary upshots. They recommend instead that we 
suppose that our social practices construct a ternary relation between empirical 
data, bare behavior, and infinitary upshots. (The ternary relation tells us which 
upshot is rationally attributable to which behaviors, given a body of empirical 
data.) However, this ternary relation is no less infinite or open-ended than is 
the original, binary relation. If finite social practices cannot provide a basis 
for assigning truth-conditions to practice propositions, then, for the very same 
reason, they are inadequate to the task of constructing infinitary assertibility- 
conditions for those propositions. 

Just as only meaning-encumbered acts can associate bare behaviors with 
infinitary upshots, only meaning-encumbered acts can associate empirical data, 
bare behaviors, and infinitary upshots. To avoid the infinite regress, we need 
some meaning-encumbered acts that do not derive their meaningfulness from 
meaning-encumbered social practices. 


Carnapian Conventionalism


A similar problem can be uncovered by examining Carnap’s theory of logical 
and mathematical truths in, for example, The Logical Syntax of Language (Car- 
nap (1937)). For Carnap, logical and mathematical truths are analytic, given 
somehow by the rules of a language system. The truth of a logical or mathe- 
matical sentence is an “internal question,” to be answered with reference to the 
associated conventions and norms. We cannot raise the question of the truth of 
these sentences (“But are there really such numbers?”) as an external question. 
Instead, our external questions must concern only the usefulness in practice of 
one language system or another. 

A crucial question for the Carnapian is the status of what I called practice 
propositions above. When I ask, “Is language L being used in community C?”
am I raising an internal or an external question? It seems that this must be 
an external question, since it is inseparable from the problem of judging the 
relative usefulness of one language over another. If I say that language L is 
more useful than language L’, my evidence must consist of actual or hypothetical 
populations putting one language or the other into use. 

If we allowed practice propositions to be treated as internal questions, we 
would produce some very anomalous results. For example, suppose language L
contains the meaning postulate that any successful population is a population 
using language L. This postulate would seem to trump the crucial external ques- 
tion, making language L maximally useful by self-proclamation. Consequently, 
it seems we must treat practice propositions about L as external to L.

However, a question that is external to L must be internal to something.
There must be a second, meta-language L’ in which the external questions
about L can be formed. This is in effect the point pressed by Quine in his attack
on Carnap’s analytic/synthetic distinction. Quine insisted that the meta-language
must, by Carnap’s own principles, be a behavioristic one, and Quine
demonstrated that practice propositions (propositions concerning which mean- 
ing postulates are actually being used in a given population) cannot be decided 
behavioristically. 

There is an independent, and less ad hominem, objection to the Carnapian 
that can be pressed at this point. By Carnap’s principles, there must be ques- 
tions that are external to the meta-language L’, questions concerning practice 
propositions or usefulness assessments about L’. These external questions must 
be internal to a meta-meta-language L”, and so on to infinity. Whether or not 
we are behaviorists, it is surely implausible to think that human social practice 
is adequate to support an actual infinity of meaning postulates, belonging to an 
infinite hierarchy of meta-languages. 

Why must each of these meta-languages be actually used by us? Why 
couldn’t Carnap willingly concede (as Tarski did) that there exist, as abstract 
objects, an infinite series of linguistic systems, each external to its predecessor? 

The inadequacy of this response is evident upon simple reflection. For 
any given linguistic system L, there are an infinite number of external meta-languages,
assigning different truth values to the sentence ‘L is useful’. In
assessing the actual usefulness of L, we must make use of the external meta-language
that we ourselves speak. Hence, the infinite regress must be a regress
of languages in actual use. 

The only way to stop the regress would be to postulate a language for which 
there are no external questions. However, this would be to abandon a funda- 
mental feature of Carnap’s philosophy. Moreover, such a thing is impossible, if 
we accept the IB conventionalist view. The only language about which exter- 
nal questions cannot be raised is one so impenetrably obscure that no one can 
understand it, like the language of Hegel’s philosophy. 


Baseball and other Platonic Objects 


Conventionalists often appeal to the phenomenon of rule-governed games as an 
analogy to the rules of language. Supposedly, no one would be tempted to pos- 
tulate an eternal, Platonic entity corresponding to something as mundane and 
contingent as the game of baseball. If we can understand a rule-governed activ- 
ity like baseball as grounded exclusively in immanent social practices, though, 
the need for a transcendent basis for language is undercut. 

However, I would argue that games like baseball are a perfect example of the 
need for transcendent objects. Indeed, one of the social purposes of having base- 
ball and similar sports is to teach the young the reality of practice-transcending 
rules. Children seem to begin life as social constructivists (or IB conventional- 
ists). When learning to play a game, children are skeptical about the existence 
of disinterested appeals to rules: they suspect that what makes something an 
out depends solely on some local, social consensus. Hence, they often try to 
manipulate the game for their own advantage, or accuse others (even adults 
who are making fair and disinterested applications of the rules) of doing so. At 
some point, a light turns on, and the young player grasps the fact that baseball 
has an integrity that transcends our fallible attempts to realize it. At this point, 
they become zealous for the strict and disinterested applications of those rules, 
even when doing so means a personal loss. They see that the value of playing 
baseball depends on playing by the rules. 

Now, of course, there is an element of conventionality to the rules of baseball. 
We could have played a game with five bases instead of four, or one requiring 
five strikes instead of three for a strikeout. However, the existence of the rules of 
baseball depends on the real existence of logically complex types, of negations 
and conditionals, as well as natural (non-gerrymandered) types involving motion 
and location and time. Without a transcendent basis consisting of these Platonic 
facts, and without our causal access to them, a genuine game like baseball would 
be impossible. 


15.4.3 Logic as the Precondition of Thought 


There is another distinction between logical and factual truths that could be 
contended for: that logical falsehoods are unintelligible, while merely factual 
falsehoods are intelligible. I agree that logical impossibilities are unintelligible, 
but I do not accept the further inference that this makes logic non-factual.
There are certain facts, namely the necessary ones, that it is unintelligible to 
deny. I would not limit this to logical necessities: it is unintelligible to deny any 
necessity, whether this is physical or causal or some other sort. It is unintelligible 
to deny that water is water, and it is also unintelligible to deny that water 
is H₂O. There is a difference between the two: I learn that the one would-be
conception is unintelligible by learning logic, and I learn that the other is
unintelligible by learning chemistry. A mental representation can represent a 
real possibility (and so represent intelligibly) only if there is, in the realm of 
modal reality, a possible situation for the representation to be about. Which 
representations represent intelligibly is itself a factual matter, a matter to do 
with the structure of the world. 

When we say that a logical falsehood is “unintelligible” or “incoherent,” 
there are three things we might mean: 


1. It literally cannot be thought. 


2. Believing it would make one vulnerable to Dutch book strategies (in which 
it is possible to lose but impossible to win). 


3. It is logically false. 


The third sense of “incoherent” of course trivially distinguishes logical im- 
possibilities from other impossibilities. I am not denying that the class of logical 
necessities forms a natural and interesting class; I am merely denying that the 
account of logical truth is radically different from the rest of semantics. 

I would deny that logical falsehoods are incoherent in the first sense above. 
People do in fact believe logical falsehoods, and this is an important and causally
relevant fact about them. I agree that believing logical falsehoods makes one 
vulnerable to Dutch books, but so does believing any impossibility. The inco- 
herency comes from believing the impossible, not the illogical. 
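
A small worked case may make the Dutch book point concrete. It is an illustrative sketch only: the symbols c (the agent's credence), S (the stake), and the indicator function are notation introduced here, not the author's. Suppose an agent assigns credence c > 0 to an impossible proposition p, and so regards cS as a fair price for a ticket that pays S if p is true. Since p holds in no possible situation w, the agent's net return is negative in every case:

```latex
\[
\mathrm{net}(w) \;=\; S \cdot \mathbf{1}_{p}(w) \;-\; cS \;=\; -\,cS \;<\; 0
\qquad \text{for every possible situation } w .
\]
```

Nothing in this calculation depends on whether p is a logical impossibility or an impossibility of some other kind, which is the point at issue.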

In any case, even if logic is in some special sense a precondition of all thought, 
this fact is irrelevant to the project of explaining the possibility of thinking about 
and knowing logical truths. If logic is a precondition of all thought, this may 
give me a reason to think logically, but it does not (by itself) explain how it 
is that I know logical truths, or what it is that I am talking about when I am 
doing logic. 


15.4.4 The A-Priority and Unrevisability of Logic 


Although I am defending a causal theory of logical and mathematical knowledge, 
it does not follow that I am committed to an empiricist account, à la John
Stuart Mill. It is quite possible, and I think probable, that elementary logic 
and mathematics are knowable a priori, and, moreover, that they are in fact 
unrevisable, hard-wired into our minds. I am also not claiming that we know the 
truths of logic by abduction, by inference to the best explanation. Sophisticated 
scientific inferences like inference to the best explanation already presuppose a 
substantial body of logical knowledge, since otherwise we would not be able to
judge what is implied by or contrary to a given hypothesis. Logic is largely
pre-scientific. My point is that the content and the epistemic status of such a
priori convictions stand in need of some kind of causal explanation.

The faculty of imagination plays a critical role in the acquisition of new 
logical and mathematical knowledge. I can discover that the sum of five and 
seven is twelve, even though I have never encountered twelve identifiable things 
in one setting. I can imagine two disjoint sets, one of five and the other of 
seven, and discover that their union must consist of twelve individuals. No 
manipulation of physical objects is needed. 

Nonetheless, we can ask: how is it that such features of our faculty of imag- 
ination are knowledge conferring? It must have something to do with the origin 
of the human mind, whether Darwinistically or otherwise. The formation of 
our faculty of imagination must somehow have been influenced by the relevant 
logical and mathematical facts, perhaps as these facts were causally efficacious 
in various episodes in our evolutionary history. 


15.5 Logical and Physical Necessity 


Heretofore I have emphasized the similarities between logical and physical ne- 
cessity. Both are knowable via their causal influence on sequences of concrete 
events. Nonetheless, there are clearly different forms of modality: logical and 
merely physical, to take two examples. It is physically impossible, but logi- 
cally possible, that I should travel faster than the speed of light. Can I give an 
account of the difference between the two? 

It is important in this context to be very clear about what sort of thing
it is to which we are attributing possibility or impossibility. For example, is
it a situation-token, a situation-type, or a proposition (the combination of a 
token or tokens with a type)? As an actualist, I believe that the only real 
tokens are actual ones. So, I view merely possible tokens as some sort of logical 
construction, built up from actual tokens and various situation-types. Such a 
construction represents a real possibility just in case the types involved have the 
modal property of being possibly instantiated (or possibly instantiated by or in 
a certain relation to certain actual tokens). Similarly, a proposition is possibly 
true just in case its type is possibly instantiated by its token. Thus, modality 
is primarily a category of property of situation-types. 

A situation-type represents a logical possibility just in case some type logi- 
cally isomorphic to it is possibly instantiated. (By logically isomorphic, I mean 
that one can be transformed into the other through the substitution of non- 
logical elements.) Dually, a situation-type represents a logical impossibility just 
in case no type logically isomorphic to it is possibly instantiated. 

Analogously, a type constitutes a physical possibility just in case some type 
physically isomorphic to it is possibly instantiated. (Physical isomorphism 
means that one can transform one into the other by substitution of non-physical-type
elements.) We normally include logical structure in our definition of physical
structure, resulting in the inclusion of all physical possibilities within the
class of logical possibilities. However, we need not do so: we could countenance 
certain types as physically possible but logically impossible. For example, it is 
physically possible for an electron to have spin +½, and it is physically possible
for it not to have spin +½. We could count the type according to which the
electron both has and does not have spin +½ as physically possible, but logically
impossible. 
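
The definitions of logical and physical possibility given above share a common schematic form, which can be set out as follows. The notation is introduced here for illustration and is not the author's: τ ranges over situation-types, the two congruence signs stand for logical and physical isomorphism respectively, and ◇Inst(τ) abbreviates "τ is possibly instantiated".

```latex
\[
\begin{aligned}
\mathrm{LPoss}(\tau) &\iff \exists \tau' \,\bigl[\, \tau' \cong_{L} \tau \;\wedge\; \Diamond\,\mathrm{Inst}(\tau') \,\bigr] \\
\mathrm{LImp}(\tau)  &\iff \neg \exists \tau' \,\bigl[\, \tau' \cong_{L} \tau \;\wedge\; \Diamond\,\mathrm{Inst}(\tau') \,\bigr] \\
\mathrm{PPoss}(\tau) &\iff \exists \tau' \,\bigl[\, \tau' \cong_{P} \tau \;\wedge\; \Diamond\,\mathrm{Inst}(\tau') \,\bigr]
\end{aligned}
\]
```

On this schema the two modalities differ only in which isomorphism relation is used. The unqualified modal notion, possible instantiation, appears unchanged on the right-hand side of each clause.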

My point is that possibility and necessity tout court are the basic reali- 
ties. Logical modality and physical modality are two kinds of structure we find 
within the modal reality of the world. They are distinct, but not fundamentally 
different in kind. 

It may be that there is a further difference between logical and physical 
necessity. It may be that physical laws are only contingently necessary, while 
the truths of logic are necessarily necessary. This could happen if we find that 
the laws of physics are themselves the resultant of some more fundamental fact 
(such as the will of God), while the laws of logic (or some of them) are absolutely 
underived. 

A standard distinction between logical and non-logical necessity relies on 
Tarski’s reduction of logical necessity to ‘truth in every model’. The inade- 
quacy of such a model-theoretic approach to logical necessity can be seen by 
considering propositional logic and truth tables. Suppose we try to identify 
logical truth in propositional logic with true in every interpretation, with the 
interpretations of the logical connectives simply stipulated by displaying the 
corresponding truth functions. This theory of logical truth can work only by 
asserting (if only implicitly) that the rows of the truth tables are necessarily 
exclusive and exhaustive of all possibilities. This is something that cannot be 
simply stipulated to be the case. 

For example, consider just negation. If by ‘false’, we mean ‘not true’, then 
the fact that the two rows of the standard truth table for negation are exclusive 
and exhaustive is itself a prior logical necessity, and not simply the product 
of our stipulating a meaning for ‘not’. Alternatively, if ‘false’ does not simply 
mean ‘not true’, then the standard truth table presupposes a substantive thesis 
of bivalence. In this case also, the mutually exclusive and exhaustive nature of 
the rows is not merely a product of convention. The semantic fact of bivalence 
is now something with which we must have some kind of epistemic contact, and 
this fact of bivalence is itself modal in character: we need to know, not only that 
every sentence in a certain class is in fact either true or false and not both, but 
that this holds of necessity. Once again, we encounter a modal fact to which we 
must have epistemic access. 
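
The dependence of the truth-table test on a prior assumption about which rows exist can be illustrated computationally. The sketch below is an assumption-laden illustration, not a rendering of any particular author's semantics: it adds a third value, labelled "neither", with negation and disjunction evaluated in the strong Kleene manner, and all names in the code are hypothetical.

```python
from itertools import product

T, F, N = "true", "false", "neither"

def neg(v):
    """Negation: swaps true and false, leaves the third value alone."""
    return {T: F, F: T, N: N}[v]

def disj(a, b):
    """Disjunction: true if either side is true, false if both are false."""
    if T in (a, b):
        return T
    if a == F and b == F:
        return F
    return N

def true_in_every_row(formula, atoms, values):
    """Check a formula against every row built from the given stock of values."""
    rows = product(values, repeat=len(atoms))
    return all(formula(dict(zip(atoms, row))) == T for row in rows)

def excluded_middle(valuation):
    """The formula 'p or not p' under a given valuation."""
    return disj(valuation["p"], neg(valuation["p"]))

if __name__ == "__main__":
    print(true_in_every_row(excluded_middle, ["p"], [T, F]))
    print(true_in_every_row(excluded_middle, ["p"], [T, F, N]))
```

With only the two classical rows the formula passes the test. Once a third row is admitted it fails. The test therefore delivers a verdict on logical truth only relative to a claim that the listed rows are exhaustive, and that claim is not itself settled by displaying the table.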


15.6 From Logic to Arithmetic 


When compared to our knowledge of logic, our knowledge of arithmetic poses 
a new challenge. Arithmetic involves the existence and properties of things, 
the numbers, that seem to exist in a realm causally isolated from our own. 
However, this appearance may be deceiving. A number is simply a natural kind
of quantifier complex.⁵ Numbers and their properties are thereby contained in
modalized logical facts. Whenever a situation supports a modalized logical fact 
involving quantifiers and identity, that situation also supports an arithmetical 
fact involving one or more numbers. For example, the logical type: 


$$\Box\,[\exists x\,(A(x) \land B(x)) \land \exists y\,(A(y) \land B(y)) \rightarrow \exists z\,\exists w\,(B(z) \land B(w) \land z \neq w)]$$


corresponds to the arithmetical type 1+1 ≥ 2. The number n is simply a kind of
quantifier complex occurring in modalized logical facts, a complex consisting of 
n quantifiers whose variables are declared to be pairwise distinct. For instance, 
the following type is of type 3: 


$$\exists x\,\exists y\,\exists z\,[x \neq y \;\&\; y \neq z \;\&\; x \neq z \;\&\; \Phi]$$
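
The general pattern behind such a complex can be sketched in a few lines of code. This is an illustration under assumptions: the function names, the variable names x1, x2, and so on, and the placeholder body Φ are all introduced here and are not the author's notation. The sketch builds the complex for a given n and counts the distinctness clauses it requires.

```python
from itertools import combinations

def quantifier_complex(n, body="Φ"):
    """Build n existential quantifiers whose variables are pairwise distinct."""
    if n < 1:
        raise ValueError("n must be at least 1")
    variables = [f"x{i}" for i in range(1, n + 1)]
    prefix = "".join(f"∃{v}" for v in variables)
    # One inequality for each unordered pair of variables.
    clauses = [f"{a} ≠ {b}" for a, b in combinations(variables, 2)]
    clauses.append(body)
    return prefix + "[" + " & ".join(clauses) + "]"

def distinctness_clauses(n):
    """Number of pairwise inequalities needed for n variables."""
    return n * (n - 1) // 2

if __name__ == "__main__":
    print(quantifier_complex(3))
    print(distinctness_clauses(3))
```

Because the construction is defined for every positive n, a language that admits formulas of this shape at all admits one for each natural number.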


The existence of logically complex types of this kind is not the result of 
any human doing. Our capacity to speak a recursive language and to think 
thoughts of arbitrary logical complexity all depends on the prior existence, in 
reality, of corresponding logical complexity. The commitment to an infinity of 
numbers and the commitment to the recursive nature of language are essentially 
the same thing, as Gödel and Poincaré long ago realized. If we believe in the
existence of a recursively defined language containing quantifiers and identity, 
we have already accepted the existence of the number series, since each number 
is simply a kind of quantifier complex producible in such a language. 

In the ancient world, the Pythagoreans and the Eleatic philosophers argued 
over which was more fundamental: numbers or logic. As T. K. Seung (Seung, 
1996, pp. 194-195) has argued, the later Plato reached the conclusion (expressed 
in his “Parmenides”) that the two are inseparable and interdependent. As soon 
as we admit into our logic formulas of arbitrary complexity, i.e., as soon as we 
recognize that we are working with a syntax and semantics that can be defined 
only recursively, we are already committed to the real existence of the natural 
numbers. 

Thus, numbers do have causal influence on the world: they do so by figuring 
in modalized logical facts that constrain what can happen. To posit that every 
number has a successor is to hypothesize that there exist real modal constraints 
of this kind of arbitrary logical complexity. 

Thus, contrary to Hartry Field, arithmetic is not a conservative extension 
of physical theory. Rather, the axioms of arithmetic are an especially bold 
conjecture, a set of infinitary generalizations based on our knowledge of their 
instances. These arithmetical conjectures are confirmed every time we encounter 
novel situations of great complexity and are able to navigate through these 
situations successfully, with arithmetic as one of our guides. 


5 Alternatively, it may be that each number is the causal ground of the existence of a family 
of equinumerous quantifier complexes. 
15.6.1 The Infinity of the Universe 


Peano’s axioms assert that every number has a successor. We find this law 
confirmed in our experience, but our experience is (perhaps) limited to rather 
small numbers. If the universe were finite, all of the logical types containing 
quantifier complexes of greater cardinality than that of the universe would be 
uninstantiated. It might be argued that we have little ground for believing 
in the existence of such uninstantiated types. If these types do not exist, then 
neither do the corresponding numbers, resulting in a counterexample to Peano’s 
successor axiom. Moreover, the existence of this infinity of numbers is not 
a contingent fact about the world. We need an argument for the necessary 
existence of an infinity of numbers. 

First of all, it is quite plausible to say that science gives us good reason 
to believe in the existence of infinite pluralities, and so in transfinite cardinal 
numbers. Consider, for example, the usefulness of the real continuum in physics. 
This argument, however, does not provide grounds for a belief in the necessary 
existence of the numbers. 

Secondly, we can turn to the ancient trick, deployed by Plato in the dialogue 
“Parmenides” and used famously by Frege in the Grundlagen, of using the 
numbers to number the numbers themselves. If we have two things that are not 
numbers, then we can infer that the number two exists. This means that there 
are at least three things: the two non-numbers, and the number two. This type 
is grounded in the number three, which is provably distinct from the number 
two. We now have four things, etc. Thus, if at least two things exist necessarily, 
then an infinity of things do. (In fact, if one thing exists necessarily, and it is 
not a numerical type and so distinct from each of the numbers, including the 
property of oneness, then we can get Plato’s cascade as a necessary existent.) 
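
The cascade can be made concrete with a small Python sketch. It is a toy model only, and it builds in the assumption the argument relies on: that each number generated is distinct from everything counted so far. The function name `platonic_cascade` and its parameters are hypothetical.

```python
from typing import List


def platonic_cascade(non_numbers: int, steps: int) -> List[int]:
    """Return the numbers generated by repeatedly numbering all things so far.

    Assumes each generated number is a new thing, distinct from the
    non-numbers and from every number generated earlier.
    """
    if non_numbers < 1:
        raise ValueError("the cascade needs at least one non-number to start")
    numbers: List[int] = []
    total = non_numbers  # how many things exist so far
    for _ in range(steps):
        numbers.append(total)  # the number that numbers all things so far
        total += 1  # that number is itself one more thing
    return numbers


print(platonic_cascade(non_numbers=2, steps=4))
```

Starting from two non-numbers, the sketch yields the numbers two, three, four, and five in turn, and the loop never runs out of new numbers however many steps are requested.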

How do we know that there exist things that are not numbers? How, to 
use Frege’s example, do we know that Julius Caesar is not a number? The 
causal/modal theory of mathematics gives us a good answer to this admittedly 
odd question. Numbers are necessary existents that impinge upon our expe- 
rience through their incorporation in modal facts. Julius Caesar, and other 
spatiotemporally located substances, are contingent, and are themselves caused 
to be by temporally prior events. These facts give us at least a strong presump- 
tion in favor of treating the two classes as disjoint. 


15.6.2 Kripke and Wittgenstein on Rule-Following 


Kripke (1982) finds in Wittgenstein’s Philosophical Investigations (Wittgenstein 
(1953)) a novel puzzle: how is it that a finite number of acts can fix the content 
of the rule being followed in a given practice? In the case of arithmetic, the set 
of arithmetical calculations that ever have or ever will be performed is finite. 
There are infinitely many different extensions of these data points to the entire 
three-place Cartesian product of numerals. For example, the “quus” function 
differs from addition only on pairs of numbers so large that no one will ever use 
them. What makes it the case that we are following the rule of addition instead 
of its counterpart quaddition? 
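
A minimal Python sketch may help fix ideas. Kripke's own illustration uses the bound 57 and the deviant value 5; the sketch instead follows the version described here, with a bound so large that no actual calculation reaches it. The constant `BOUND`, the sample list `performed`, and the function names are illustrative assumptions.

```python
# Hypothetical threshold, larger than any number used in an actual calculation.
BOUND = 10**100


def plus(x: int, y: int) -> int:
    """Ordinary addition."""
    return x + y


def quus(x: int, y: int) -> int:
    """Agree with addition below BOUND and return a deviant value otherwise."""
    if x < BOUND and y < BOUND:
        return x + y
    return 5


# A finite record of calculations that have actually been performed.
performed = [(2, 3), (68, 57), (12345, 67890)]
agree = all(plus(x, y) == quus(x, y) for x, y in performed)
print(agree)
```

The two functions agree on every recorded calculation, so no finite record of past use distinguishes them; they come apart only on arguments at or beyond the bound.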

Kripke’s puzzle seems to put the order of explanation the wrong way around. 
It is because we mean addition by “plus” that we are (or should be) following 
the addition rule, not vice versa. The fact that the linguistic and cognitive oper- 
ations in question represent addition is determined by systematic causal connec- 
tions between them and facts in the realm of logical necessity. Our arithmetical 
calculations are (teleologically speaking) supposed to connect in a particular way 
with the set of first-order logical necessities. Natural numbers can be system- 
atically translated into strings of quantifiers, qualified with suitable identity or 
non-identity statements, as in Frege’s logicist programme. Arithmetical calcula- 
tion is supposed to facilitate efficient computation of logical necessities via these 
translations. If these cognitive operations represented quaddition instead, these 
systematic causal connections would be quite different (and a good deal more 
complicated). 

I cannot think of any way of making sense of direct, causal connections be- 
tween bare mathematical facts (situations including only certain numbers and 
some mathematical relations between them, such as ‘7 + 5 = 12’ or ‘3 < 5’) 
and temporally-located events and processes. Instead, the connection is more 
indirect and holistic. Particular logical facts impinge directly on concrete events 
and processes. Implicit in these logical facts are numbers (types of quantifier 
complexes) and their mathematical relations (such as succession and inclusion). 
Representations of numbers and their relations in the mind (which we might 
call “cognitive arithmetic”) are confirmed by their reliability and fruitfulness in 
generating information about first-order logical necessities (via the translation 
of numbers into quantifiers restricted by identity and distinctness conditions). 
Thus, there are two systems, real arithmetic and cognitive arithmetic, whose 
agreement is caused and sustained by a finite number of causal interactions be- 
tween first-order logical facts, facts about concrete necessities and possibilities, 
and cognitive facts constituting our knowledge of these modal facts. 

Arithmetical facts are knowable by virtue of a systematic translation be- 
tween atomic facts about numbers (facts about the value of particular sums 
and products) and modal facts of first-order logical necessity. Each atomic 
fact about numbers can be mapped to a corresponding set of theorems of first- 
order logic (as in standard logicist treatments of arithmetic). However, isn’t 
this systematic translation between arithmetic and logic itself a rule that can 
be quusified? The translation is an infinitary rule, but our actual mathemat- 
ical practice concerns only finitely many instances of this translation scheme. 
Doesn’t the Kripke/Wittgenstein problem arise at this point? 

The answer to this deviant translation problem is to posit that the num- 
bers really exist, and really participate in those modal situations to which the 
translation scheme links them. That is, the numbers 2, 3, and 5 are real con- 
stituents of the modal situation-token that supports the logical necessity of the 
translation of ‘2 + 3 = 5’ into first-order logic. Moreover, since these modal 
situation-tokens enter into causal relations with ordinary events and processes, 
the individual numbers are also implicated in these causal relations. 

Figure 15.1: Flow of Arithmetical Information 

However, we still must confront the fact that we humans have interacted 
in this way with only finitely many natural numbers. What determines the 
extension of the translation scheme to numbers with which we have had no such 
interaction? The answer would have to take something like the following form. 
The successor relation is a real, quasi-causal relation between numbers. The 
successor relation enters into the very being of all numbers other than 0. Thus, 
in interacting with numbers larger than 0, we are interacting with the successor 
relation itself. Once we reach the point of forming universal generalizations 
about numbers, such as the generalization that every number has a successor, 
our representational state is informed by a causal connection to this successor 
relation (as a constituent of the relevant modal facts). In other words, succession 
is a natural kind, like gold or aardvark. 

This is not to deny that quusified successor-like relations exist. It is merely 
to deny that these quusified relations enter into causal connections with our epis- 
temic states when we form generalizations about numbers and their successors. 
Moreover, there are important causal/explanatory asymmetries between succes- 
sion and quuccession. The fact that every number has a successor explains why 
every number has a quuccessor, but not vice versa. This asymmetry is crucial, 
because we can then appeal to Occam’s razor (see appendix B) to explain why 
it is objectively far more likely that our thoughts about succession are caused 
by succession and not quuccession. Occam’s razor tells us to minimize the factors 
that we take to be causally relevant to the phenomena to be explained. 
If we hypothesize that quuccession is causally relevant to our thoughts about 
succession, then it would follow that succession is also relevant, since succession 
is needed to explain the structure of quuccession. In contrast, if we hypothesize 
that it is succession that is directly relevant to the causal explanation of our 
thoughts of succession, we need not suppose that quuccession is also relevant. 
Hence, Occam’s razor dictates that the direct connection to succession is the 
best explanation of our basic arithmetical beliefs. 


15.7 Set Theory and Other Branches of Mathematics 


Not all of the branches of mathematics will succumb to the same treatment 
as does arithmetic. In the case of geometry and real analysis, for instance, a 
structuralist theory along the lines of Dedekind (1888) seems called for: Eu- 
clidean geometry is about any structure that satisfies its axioms, and similarly 
for non-Euclidean geometries and real analysis. The sort of “elimination” of 
real numbers proposed by Hartry Field (1980) fits into this structuralist pat- 
tern: physics postulates the existence of various systems of physical quantities 
(distance, duration, mass, field intensity) that validate the axioms of real analy- 
sis (under suitable interpretation). Thus, physics, and other empirical sciences, 
verify the existence of structures of certain kinds, and the various branches of 
structuralist mathematics investigate the logical consequences of being struc- 
tures of the postulated kind. 

Consequently, there is no such thing as the real numbers, or Euclidean space, 
as subsistent objects. In real analysis, we do not study the properties and 
relations of a collection of objects (the real numbers); instead, we study the 
properties of any of a class of structures, each of which validates the axioms of 
analysis. In contrast, arithmetic is primarily the study of the natural numbers, 
which are definite objects in their own right, although, secondarily, its results can 
be applied to any omega-sequence (that is, any sequence sharing the structure 
of the natural numbers). The difference between arithmetic and analysis lies in 
the tightness of the connection between arithmetic and logic, via the definability 
of finite cardinality in first-order predicate logic. 

Set theory constitutes a more difficult case. I would lean toward classifying 
it with arithmetic (as having an absolute domain, the sets), rather than with 
the structuralist branches, such as analysis and geometry. Just as numbers 
can be thought of as the grounds of natural families of logical types (types 
involving quantifier complexes of the corresponding cardinality), so too can 
sets be thought of as grounds of natural families of disjunctive logical types. 
For example, suppose the set A is {a₀, a₁, ..., aₙ}. Set A is the cause of the 
existence of a family of co-extensive types, of which the following is an instance: 


λx(x = a₀ ∨ x = a₁ ∨ ... ∨ x = aₙ) 


Let AY be the family of types co-extensive with this one. We can define 
membership as follows: 


∀x(x ∈ A ↔ (x ⊨ AY)) 


Sets, even infinite sets, are essentially things that can be given by exhaustive 
enumeration. For this reason, something like Zermelo-Fraenkel set theory must 
be correct, since an enumeration can include only things that exist prior to the 
enumeration, a restriction reflected in the iterative hierarchy of Zermelo-Fraenkel 
theory. 

How is it that we are influenced causally by sets and their mathematical 
properties? Quantum mechanics may play a key role in elucidating this connec- 
tion. As David Bohm and Basil Hiley have argued, quantum mechanics concerns 
the process by which physical wholes are formed and dissolved (Bohm and Hiley 
(1993)). Thus, in quantum mechanics, it is sets of physical objects, and not just 
objects taken individually, that are causally efficacious. 

The possibility of the formation of physical wholes presupposes a prior meta- 
physical composition of individuals into sets. When a set of physical systems 
meets certain conditions, it constitutes a quantum whole, whose properties are 
not decomposable into the properties of the parts. Our experience of this process 
of physical composition gives evidence of an underlying metaphysical process by 
which arbitrary collections of things constitute a metaphysical whole, i.e., a set. 
Even a nominalist like Field (Field, 1990, page 214), for instance, confidently 
makes use of an axiom asserting the existence of arbitrary sums of spacetime 
regions. 

Presumably, it is this feature of quantum mechanics that is ultimately responsible 
for the Gestalt phenomenon in perceptual psychology. The formation of a 
quantum whole, comprising a system of perceptual objects and the perceiver’s 
sensory system, is a precondition of the perception of a holistic Gestalt. Gestalt 
perception is thus literally the perception of sets of objects, and not just of the 
objects individually. 

Penelope Maddy (1990) has argued for this latter point: the perceptibility of 
sets of physical objects.6 However, this perceptibility is not sufficient to provide 
an account of our knowledge of set theory. We need to have some explanation 
for our knowledge of general facts about sets, of the kind represented by the 
axioms of set theory. In addition, our perception of sets of physical objects does 
not seem to have any direct bearing on our knowledge of the higher ranks of set 
theory, or of the existence of unit sets or the null set. 


6 Maddy (Maddy, 1990, page 87) argues that set theory is metaphysically prior to arithmetic, 
since she postulates that numbers are properties of sets. However, the truths of arithmetic 
are already implicit in first-order logic with identity, even without any apparent ontological 
commitment to the existence of sets. See Field’s “Reply to Maddy” (Field, 1990, pp. 
208-209). 
I think the solution to these problems can be found by thinking of our 
knowledge of sets of higher ranks as analogous to our knowledge of the future. 
We know the future, not by virtue of any effect of the future upon our minds, but 
through the influence on our minds and on the future by certain common causes. 
Similarly, our experience of the metaphysical combination of physical objects into 
sets, without limitation, provides us with access to a general tendency in things 
toward agglomeration. This same process of agglomeration is also responsible 
for the formation of arbitrary sets of higher ranks. Our knowledge of the axioms 
of power set, separation, choice, and replacement reflects our knowledge of the 
universality of this agglomeration process. The iterative hierarchy of Zermelo- 
Fraenkel set theory reflects a quasi-causal process by which new sets, of higher 
and higher rank, are generated through the successive agglomeration of sets of 
lower rank. (This generation of successive ranks does not take place in time, 
but in an order orthogonal to our time line.) Thus, we know that sets of sets 
exist, even though we have no direct contact with such things, since the very 
same tendency of individuals to form sets is at work, both in the physical world 
and in the realm of sets of higher ranks. 

This view of mathematical ontology and epistemology has no revisionist 
implications for the practice of mathematics, unlike many philosophical posi- 
tions, such as intuitionism, finitism, or modal-structuralism. Mathematics is a 
healthy, progressive science, exploring the structure of modal reality in much the 
same way as physics explores the structure of physical forces and interactions. 
Mathematicians need no help from philosophers to do their job. 

However, this version of set-theoretic realism does not by itself settle the 
question of bivalence. A proposition in the language of set theory is determined 
to be true if it is true in the universe limited to a particular rank in the iterative 
hierarchy and its truth is necessarily preserved by the process of agglomeration 
that leads to still higher ranks. A proposition is determined to be false when its 
negation becomes determinately true. There could, in principle, be propositions 
whose truth value never stabilizes in this way but instead fluctuates from truth 
to falsity and back again as the ranks accumulate. 


15.8 Alternatives to Mathematical Realism 


15.8.1 Fictionalism 


Recent so-called nominalists, such as Hartry Field, have argued that the use- 
fulness of arithmetic and the rest of mathematics in calculating logical conse- 
quences does not depend on the postulates of arithmetic’s actually being true: 
it is enough that they are conservative extensions of our non-mathematical the- 
ories. A mathematical theory can be a conservative extension only if it is con- 
sistent. Hence, we must be able to discover, presumably by scientific induction, 
that the postulates of arithmetic are consistent. 

Consistency and conservativeness are themselves mathematical (metalogi- 
cal) properties. Field is not a fictionalist about these properties, nor about 
the infinitely large collections (physical theories) that bear these properties. 
This seems an arbitrary and unmotivated distinction. Why is the fact that an 
unsurveyable theory is consistent any less problematic ontologically or epistemologically 
than any fact in number theory? As we know from Gödel’s work, 
metamathematical facts such as a theory’s consistency can in fact be represented 
by theorems in number theory. 

Moreover, can we really find reason for believing the postulates of a mathe- 
matical theory to be consistent without simultaneously finding reason for believ- 
ing them to be true? We find that, time and time again, postulating the axioms 
of arithmetic and using arithmetic in our computations of logical consequence 
is reliable. This sort of reliability is exactly the phenomenon that leads us to 
accept (provisionally) the truth of scientific hypotheses outside of mathematics. 
The only reason Field has for treating mathematics differently is his concern to 
preserve the causal theory of knowledge. Consequently, to the extent that I can 
account for the possibility of causal influence on the part of numbers, I have 
removed all motivation for mathematical anti-realism. 

In any case, Field does not attempt to provide a fictionalist account of our 
logical knowledge. Field considers logic (at least, first-order logic) to be episte- 
mologically and ontologically unproblematic. He takes for granted, for example, 
that the language of our physical theories is recursive, comprising an infinity of 
sentence types. Only finitely many of these types have concrete instantiations: 
how does the nominalist explain our cognitive and epistemic access to this actual 
infinity of logical types? 

Finally, Field argues that we have good reason to believe that the axioms 
of standard mathematics are logically consistent. This reason is empirical: the 
mathematical community over time has succeeded in weeding out many incon- 
sistencies. If our surviving mathematical theories were inconsistent, someone 
would have discovered this inconsistency by now. 

Field’s argument depends on an appeal to the causal efficacy of consistency 
and inconsistency. His inference to the consistency of mathematics is an infer- 
ence to the best explanation. Such inferences presuppose that the consistency of 
mathematics can genuinely cause our repeated failures to find an inconsistency. 
This means that there must be causal connections between logical features (consistency, 
inconsistency) of certain abstract objects (mathematical theories) and 
our minds and behaviors qua mathematicians. If our mental states can be 
causally connected in this way to theories, why not to numbers and sets? 


15.8.2 Structuralism 


According to structuralism, mathematics is really the study of certain kinds of 
structures, namely those structures that satisfy the axioms of the theory. As I 
indicated earlier, I find a structuralist account quite plausible as an account of 
many branches of mathematics, like geometry or analysis. There are only three 
parts of mathematics toward which I am inclined to take a straightforwardly 
realist attitude: logic, arithmetic, and set theory. As far as I know, no one 
has offered a structuralist account of logic. It is possible to think of arithmetic 
as the theory of omega-sequences, but this seems to miss something important: 
namely, the use of numbers in counting things. As Frege and Russell recognized, 
there is a tight connection between numbers and quantifiers, so tight that I have 
identified numbers with natural kinds of quantifier complexes. 

There is a different problem with taking a structuralist approach to set 
theory: structuralism seems to involve treating special mathematical theories 
as sub-theories of a more general theory of structures. However, set theory is 
itself a theory of possible structures. There is no more general theory in which 
set theory can be embedded. 


15.8.3 Modalism 


The modalists, such as Putnam and Hellman (1989), take mathematics to be the 
study of possible structures. This enables them to avoid the commitment to the 
actual existence of infinitary structures. However, they are still vulnerable to the 
other objections mentioned in the last section. In addition, the modalists owe us 
an explanation of the possibility of our modal thought and knowledge. Merely 
possible structures can have no effect on our minds; how, then, is reference to 
or knowledge of them possible? 


15.9 Why the Human Mind Is Not a Turing Machine 


If we learn about mathematical facts by interacting with modal facts, then there 
is no in-principle upper bound to the set of learnable mathematical truths. 
There are no mathematical truths that are in principle unknowable. In light 
of Gödel’s incompleteness theorems, this means that the set of mathematically 
learnable truths is not recursively enumerable, which in turn means that the 
human mind, qua unbounded learner of mathematics, cannot be modeled as a 
Turing machine. 

This rejection of the Turing machine model as adequate for the represen- 
tation of the mathematical mind does not necessitate speculation about exotic 
forms of quantum causation, as Roger Penrose has suggested. It is consistent 
with the sort of causal Platonism that I am defending to hold that the hu- 
man brain can be modeled as a Turing machine, or even as a finite automaton, 
without remainder. The human person, characterized teleologically, cannot be 
extricated from the human environment. Unlike that of premathematical animals, the 
human environment is, thanks to its logical/modal component, infinitary in 
character. It is essential to the Turing machine model that only finitely many 
squares on the memory tape are actually written on. This means that a Turing 
machine is always being represented as interacting with a finite environment. 
The memory tape is infinitely long, but only a finite segment of the tape is used 
on any actual run of the machine. 

Since the human person cannot be extricated from the human environment, 
and the human environment is infinitely rich in information, any Turing model 
of the human mind leaves out something essential. The transition from the 
Turing machine model to the Platonic model is analogous to the transition that 
occurred from the finite automaton model of Skinnerian behaviorism to the Tur- 
ing machine model of computational psychology. In both cases, a certain kind 
of idealization takes place, but the idealization is needed to illuminate 
essential features of the mind. Presumably, there is some finite bound to potential 
human memory, given the finitude of the cosmos and the ever-encroaching 
fate of thermodynamic heat death. Thinking of the human mind as a Turing 
machine involves ignoring this actual bound on potential memory, since this 
bound is inessential to what the mind is doing when it performs computations. 
Similarly, there may well be a bound, fixed by the physics of the cosmos, on the 
set of mathematical truths that human beings can learn. The Platonic model 
involves ignoring this accidental bound, since doing so is essential to a correct 
characterization of the nature of mathematical thought and knowledge. 


16 


A Teleological Theory of the Mind 


In this chapter, I want to look at a number of problems in the philosophy of mind, 
and discuss briefly the relevance of a teleological theory of mental representation 
to these problems, including downward causation, qualia, and free will. 


16.1 The Irony of Non-Reductive Materialism 


It’s inevitable that a discussion of the mind/body problem begin with Descartes. 
It was Descartes who crystallized the mind/body problem as it exists for the 
modern mind. Descartes’s view, of course, is radically dualistic: the mind 
and body are two separate substances, with radically different essences (the 
one, thought, and the other, extension). From Descartes’s point of view, this 
dualism represented an essential first step out of the confusion of the medieval 
synthesis, in which matter was endowed with mind-like attributes (teleological 
properties) and many functions of the mind (such as sensation) were believed 
to be essentially dependent on the cooperation of matter. 

Of course, this dualism extracts a price, a price that the inheritors of moder- 
nity have, in general, been unwilling to pay. The price to be paid is the loss 
of any intelligible and plausible story about the causal links between the mind 
and the body. Descartes resorted to his infamous speculations about the pineal 
gland, while Malebranche took the extreme expedient of denying secondary cau- 
sation altogether. 

It was Hobbes who prefigured the consensus of the twentieth century by 
identifying the activities of the mind with certain motions of matter. Twentieth- 
century materialism has taken a variety of forms: behaviorism, brain state iden- 
tity theory, functionalism, and non-reductive or supervenient materialism. In 
each of these cases, the materialist has accepted Descartes’s anti-Aristotelian 
conception of matter, disputing only his account of the mind. However, since a 
simple identification of the mind with matter has proved impossible, the dual- 
istic challenge to mental causation remains unmet. 

Mental descriptions cannot be reduced to or translated into physical descrip- 
tions because mental descriptions involve a different level of abstraction. The 
very same mental states and operations can, in principle, be realized in an infi- 
nite number of different physical media. This multiple realizability holds even 
between mental facts and facts about algorithms or computational processes. 

If mental descriptions involve simply a re-description of the physical facts at 
a very high degree of abstraction, they would seem to be causally redundant. 
All the real work of making things happen takes place at the level of the fully 
determinate physical facts. Mental facts merely supervene upon these purely 
physical goings-on and, therefore, are merely epiphenomenal, a causally inert 
shadow cast by physical reality on a linguistic-conceptual screen that abstracts 
from that reality only some general features. The intentional stance (as Dennett 
puts it) is no doubt a useful stance to take, but it gives us no access to the 
causally relevant features of the situation. 

Non-reductive materialism, then, despite its origins in dissatisfaction with 
the failure of Cartesian dualism to explain mind/body interaction, is saddled 
in the end with an equally insoluble version of the very same problem. The 
solution lies in reconsidering both parts of Descartes’s legacy to the modern 
world: not only his account of the mind, but also his account of the physical 
world. Only by taking seriously the realm of teleology in nature can the paradox 
of mind/body dualism be overcome. 


16.2 Supervenience and Type and Token Identity 


Supervenience is a relation between classes of types. One class of types is said 
to supervene on a second when, necessarily, which type from the first class is 
realized in a given token is always determined by which type from the second 
class is realized by the same token (or possibly, which types are realized by the 
same and other actual tokens). The key to acquiring a clear conception of super- 
venience is to clarify the meaning of ‘determined by’. We get two quite different 
pictures, depending on how we understand this relation of determination. 

The simplest model of supervenience, given the existence of tokens, types, 
and a three- or four-valued interpretation linking them, is to say that one class 
of types is determined by a second just in case, whenever a token has a member 
φ of the first class, it also has some set of types A from the second class such 
that every token having all the types in A necessarily realizes type φ. Let’s call 
this relation strict supervenience. 


Strict Supervenience 


Class A of types strictly supervenes on class B iff A and B are 
disjoint, and for every possible state-token s and every type φ in A, 
if s ⊨ φ, then: there exists a subset C of B such that s supports 
C and for every possible token s’, if s’ supports C, then s’ supports 
φ.1 
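
The definition can be tested on small finite models. The following Python sketch is an illustration only: it replaces "every possible token" with an explicitly listed finite set of tokens, and the names `Model` and `strictly_supervenes`, together with the example types, are hypothetical. It uses the fact that the existential clause need only be checked for the largest candidate C, namely all the B-types that s supports, because any token supporting that set also supports each of its subsets.

```python
from typing import Dict, FrozenSet, Set

# A finite stand-in for the space of possible tokens:
# each token name is mapped to the set of types it supports.
Model = Dict[str, FrozenSet[str]]


def strictly_supervenes(a: Set[str], b: Set[str], model: Model) -> bool:
    """Check strict supervenience of class a on class b over a finite model."""
    if a & b:
        return False  # the two classes must be disjoint
    for types_of_s in model.values():
        # Largest candidate C: every B-type that s supports. If this C fails,
        # every smaller subset fails too, since more tokens support it.
        c = types_of_s & b
        for phi in types_of_s & a:
            for types_of_other in model.values():
                if c <= types_of_other and phi not in types_of_other:
                    return False
    return True


mental = {"pain"}
physical = {"c-fibers", "silicon-state"}
tokens: Model = {
    "s1": frozenset({"pain", "c-fibers"}),
    "s2": frozenset({"pain", "silicon-state"}),
}
print(strictly_supervenes(mental, physical, tokens))
```

Adding a token that supports "c-fibers" but not "pain" makes the check fail, since the physical types of s1 would then no longer guarantee the mental type.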


Not every situation-token has a spatial or temporal location. Some tokens, 
for example, realize eternal, non-local facts, such as modal or causal facts. Let’s 
say that two tokens coincide just in case they share the same spatiotemporally 
located parts. Coincidence is thus a weaker relation than identity. We can use 
this weaker relation to define a form of loose supervenience. A class A loosely 
supervenes on class B if and only if, whenever s is a possible token realizing 
some φ in A, and s is part of some world w, there must be a token s’ that 
coincides with s and is also part of w, and some set C of types in B realized by 
s’, such that in every world in which some token realizes every member of C, 
there is some coincident token of type φ. 


Loose Supervenience 


Class A of types loosely supervenes on class B iff A and B are disjoint,
and for every possible token s, and every type φ in A, and
every world w, if s is part of w and s supports φ, then: there exists
a subset C of B and a token s’ that is part of w, coincides with s,
and supports C such that for every world w’ and every token s₁ that
is part of w’ and that supports C, there exists a token s₂ that is also
a part of w’, that coincides with s₁, and that supports φ.
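The nesting of quantifiers in this definition is easier to survey in symbols. The following rendering is a sketch, not the author's own notation: it assumes $s \sqsubseteq w$ for "s is part of w", $s \approx s'$ for "s coincides with s'", and $s \models C$ for "s supports every member of C".

```latex
\[
A \cap B = \emptyset \;\wedge\;
\forall s\,\forall \varphi \in A\,\forall w\,
\Bigl[ (s \sqsubseteq w \wedge s \models \varphi) \rightarrow
  \exists C \subseteq B\;\exists s'\,
  \bigl( s' \sqsubseteq w \wedge s' \approx s \wedge s' \models C \wedge
    \forall w'\,\forall s_1\,
    [ (s_1 \sqsubseteq w' \wedge s_1 \models C) \rightarrow
      \exists s_2\,( s_2 \sqsubseteq w' \wedge s_2 \approx s_1 \wedge s_2 \models \varphi ) ]
  \bigr)
\Bigr]
\]
```

Compared with strict supervenience, the only structural changes are the relativization to worlds and the replacement of identity of tokens by coincidence at two places: the base token need only coincide with s, and the necessitated token need only coincide with the token supporting C.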


Where strict supervenience holds, it warrants an ontological conclusion: the 
members of the supervening class of types are each identical to some (possibly 
infinitary) construction from the members of the second class of types. This 
entailment is supported by the following definition of type identity: 


Type φ is identical to type ψ iff every possible token supporting φ
also supports ψ, and vice versa.


Loose supervenience, in contrast, warrants no such ontological conclusion. 
I will argue that the relation between the mental and the physical is one of at 
most loose, not strict, supervenience. For example, in the case of the perception 
of color, there is good reason to expect a very reliable connection between the 
quality of the experience and certain neural event-types, since the whole point of 
color perception is to bring us into attunement with certain physical regularities 
in our environment. If there were significant breakdown in the determination 
(loosely speaking) of the mental by the physical, perception could not perform 
its proper function. Whenever a mental event of a certain perceptual type 
occurs, there will be a coincident physical event belonging to one of a definite 
class of neurological types, and whenever one of these neurological types occurs, 
a coincident mental state-token of the corresponding perceptual type will also 
occur. 


1 Strict supervenience corresponds to Kim’s modal operator definition of strong
supervenience (Kim, 1997a, p. 188).

Functional types are higher-order types, involving quantification over
first-order types. A functional type has the logical form:


λx : ∃v(φ(v) & (x ⊨ v))


In this formula, φ is a metalinguistic variable, placing some condition on
the type-variable v. A token s belongs to this higher-order type just in case it
belongs to some type v meeting the condition φ. A crucial question concerning
type-identity is this: supposing that the set A consists of all the types meeting
the condition φ, and ⋁A is the disjunction of the members of A, is the
higher-order type identical to the disjunctive type ⋁A? For example, suppose that
A consists of an infinite set of physical types. Is the type ⋁A also a physical
type, and is the higher-order functional type λx : ∃v(φ(v) & (x ⊨ v)) identical
to ⋁A?

In order to decide this question, we must ask whether there is in fact a set
containing all the physical types that could possibly satisfy condition φ. If we
are moderate actualists, we should hold that a type exists in actuality only
if it has a token,² but we should also concede that there could have existed
types that do not in fact exist. If there are no non-actual types, neither are
there any sets, such as A, containing non-actual types. Consequently, neither is
there such a type as the disjunctive type ⋁A. Thus, the higher-order property
λx : ∃v(φ(v) & (x ⊨ v)) exists in actuality, but there is no disjunctive type ⋁A.
I will, therefore, deny that higher-order types, including teleofunctional types,
are identical to any physical type.

Another argument for the same conclusion is independent of actualism. Even 
if all types that exist in any possible world exist in this one, there may not be 
a set of possible physical types. Consequently, there might exist no disjunctive 
physical type equivalent to a given higher-order type. 

Now I will turn to the thesis of token identity. What does it mean to say 
that mental states are token-identical to physical states? One interpretation of 
this claim is that every token realizing a mental type also realizes at least one 
physical type. However, this interpretation is too weak. Suppose every token 
realizing a mental type had two parts, one physical and one super-physical. In 
this case, every mental token would realize some physical type, by virtue of its 
physical part, but it would be misleading to say that mental tokens are just
physical tokens. 

A stronger version of the thesis of token-identity would go something like 
this. Class A of types is token-identical to class B of types just in case, for 
every possible token s and every type φ in A, there is a type ψ in B such that
s supports type ψ, but no proper part of s does so.

2 Alternatively, we might hold that a type exists only if there is some token of some type
that is a determination of the same determinable. All that is needed to make this argument
work is to assume that physical types exist contingently.


In the remainder of this chapter, I will lay out my reasons for denying both
the strict supervenience of the mental on the physical and the token-identity
thesis. At the same time, I will avoid any form of substance dualism or
interactionism. The secret to this solution is the introduction of teleological types,
which are a kind of higher-order causal property. Mental types are identified 
neither with physical types (not even infinitely complex disjunctions of phys- 
ical types) nor with some super-physical, first-order property (as in classical 
Cartesianism), but with types involving higher-order causation. 


16.3 Downward Causation versus Epiphenomenalism


The problem of downward causation is a major source of worry about causal, 
functional, and teleological theories of the mind. My own account of mental 
representation makes the representational character of a neural state a higher- 
order property of the state, loosely supervenient on the first-order properties 
and the causal and modal facts of the world. This at least suggests that the 
mentality or representationality of the neural states is itself causally inert, riding 
epiphenomenally on top of the “real” causal story, which occurs exclusively at 
the level of physics. 

This is a caricature of the account I have given, however. Mental states are 
causally efficacious, even efficacious at the physical level, and mental properties 
can figure as such in perfectly good causal explanations of physical events. The 
account is quite far from being epiphenomenal. 

There are two separate issues that need to be examined: (1) do mental 
situations cause physical situations, and (2) do mental situation-types figure in 
genuine causal explanations of the instantiation of physical types? 

What are mental-state tokens? They include, but are not identical to, 
physical-state tokens. My position is an inclusion thesis, not an identity thesis. 
In addition to some physical or neural state, a mental state includes informa- 
tion about the somatic and environmental context of this state and about the 
causal/modal structure of the world, with sufficient information to make it ob- 
jectively likely that the causal antecedents of the state fulfill Wright’s definition 
of teleofunction. Thus, a mental situation-token can be decomposed into three 
parts, one entirely atemporal, one temporal and remote, and one temporal and 
local. The atemporal part is the situation supporting the modal and causal 
facts that link the physical-state type with its characteristic effect. The sec- 
ond part gives the relevant physical context of the immediate physical state, 
supporting the fact that there is an objective likelihood that the first part (the 
causal connections) was indeed involved in causing that immediate state (the 
third part). The local, temporal part is the physical state in the brain that has 
the representational function. 

The second part of the mental token is the somatic context of the third
part: those features of the human body that, taken as a whole together with
the relevant causal constraints (the first part), make it objectively likely that the
third part satisfies the Wrightian definition for proper function.³

The first and third parts of the mental-state token are always directly in- 
volved in producing the succeeding mental states and behavior of the agent, and 
the second part is at least indirectly involved.⁴ Hence, the mental states are
causally relevant to physical events. Epiphenomenalism, in the classic sense, is 
avoided. 

In fact, the main difficulty is explaining not how mental tokens can be causes, 
but how they can be effects. Only one component of mental events can be caused 
by changes both within and without, namely, the localized physical component. 
The other components are unchanged and unaffected through these vicissitudes. 

This account of mental token-causation could be labeled the ‘catalytic the- 
ory’, since the eternal and holistic (non-local) components of the mental events 
participate in the causal process without being affected themselves, as catalysts 
participate in chemical processes without undergoing changes. Mental state M1 
causes the local/physical component of mental state M2. All of the components 
of M1 are involved in the causal connection, but only one component of M2 is 
an effect. However, the other two components of M2 are exactly the same as 
the corresponding components of M1. These two components are the constant 
element in the process. The following diagram illustrates this catalytic theory: 


[Figure 16.1: Mental Causation. Diagram relating mental states M1 and M2.]


3 See my non-retrospective definition of teleological function in section 12.3.

4 In light of the measurement problem of quantum mechanics, it may be that the wider
physical context, including this second part of the mental token, is always directly involved in
producing the behavior. In other words, QM is inconsistent with the kind of causal locality
that underlies an atomistic account of physical causation. See also section 18.5.


The diagram illustrates a typical mental-state/mental-state causal interaction.
A mental-state token M1 consists of three components, an eternal, modal
component E1, a holistic, physical component H1, and a localized, physical
component P1. The three jointly produce a localized effect, P2, which in turn
forms part of a new mental state, M2, when combined with the two unchanging,
“catalytic” elements E1 and H1. M2 can, in turn, act as a cause of further
state-tokens.
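The structure of this catalytic account can be sketched as a small data structure. This is only an illustration under assumptions: the class `MentalToken`, its field names, and the function `successor` are hypothetical labels, and the string values merely stand in for the components E1, H1, P1, and P2 named in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MentalToken:
    eternal: str   # E: atemporal situation supporting the modal and causal facts
    holistic: str  # H: wider somatic and physical context
    local: str     # P: localized physical state in the brain


def successor(m1: MentalToken, produced_local: str) -> MentalToken:
    """Build M2 from M1 and the one component that M1 actually causes.

    All three components of m1 count as involved in producing
    `produced_local`; only the local component differs in the result,
    while the eternal and holistic components are carried over unchanged.
    """
    return replace(m1, local=produced_local)


m1 = MentalToken(eternal="E1", holistic="H1", local="P1")
m2 = successor(m1, "P2")
print(m2)
```

Making the dataclass frozen reflects the point that the eternal and holistic components are not altered by the causal process; a new token is formed around the new local state instead.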

In the case of causal explanation (involving types as well as tokens), there 
is in principle no reason why an instantiation of a mental property could not 
be part of a genuinely causal explanation of the instantiation of some physical 
property. In section 5.6.1, I gave an account of heterogeneous causal explana- 
tions, in which the system of classification used in the explanans differs from 
that used in the explanandum. Mental-to-physical explanation is an example of 
such heterogeneous explanation. 

Nonetheless, there is a legitimate basis for worry, since it seems that it is the 
first-order causal properties of a mental-state token that are needed in explaining 
behavior, and not the second-order and holistic properties that are involved in 
determining the token’s teleological and representational character. A neural 
state of a certain physical kind will have the same effect on behavior, whether 
or not it is likely to have been caused in such a way as to have the function of 
representing some information. The representational aspect of the token seems 
to be exclusively hypothetical or backward-looking and to have no bearing on 
the effects of the mental state. 

This worry is quite warranted in the case of organisms with very primitive 
functional organization. For example, an organism that exhibits only reflexes 
has representational states, but the representational character of these states is 
irrelevant to explaining its behavior. The explanation of behavior is possible 
only by making reference to the causal properties of its neural states, not the 
teleological or representational properties. 

However, when an organism has a more interesting mental life by virtue of a 
recursively structured functional organization, functional properties can be used 
in explaining its behavior. For example, if the organism has both perceptions 
and appetites/aversions, the function of the appetites and aversions is, to some 
extent, definable only in terms of the presence or absence of perceptions of 
certain kinds. In other words, we begin to encounter higher-order functions 
(although not yet higher-order representations). An appetite for some state φ is
a state that tends to lead to behaviors that are represented internally as likely
to lead to φ: hence, the effect of the appetite depends on the functional states
of the organism’s perceptual faculties. This means that, in the presence of an 
appropriate appetite, the functional/representational character of a mental state 
can be used to explain the organism’s behavior. (For a more detailed discussion 
of this sort of example, see section 8.5. See also Tyler Burge’s classic paper on 
the subject, “Individuation and Causation in Psychology” (Burge (1989)).) 

By way of illustration, consider the higher-order property of fragility. In 
explaining why a particular object breaks, there is no need to posit the existence 
of a distinct, dispositional property, over and above the particular physical and 
chemical features of the object that explain its fragility. However, suppose there 
is a warehouse to which a variety of kinds of glass objects are sent. As each 
shipment arrives, samples of the objects are subjected to a variety of tests, whose
function is to identify the property of fragility. Shipments determined to be
fragile are stored in the north side of the warehouse, and non-fragile shipments 
are stored in the south side. In this case, thanks to the participation of the 
objects in a higher-order functional system, the property of fragility is causally 
explanatory of the location of shipments within the warehouse, and hence we 
have good reason to posit the existence of such a dispositional property. 

It is true that whenever the organism’s behavior can be explained causally 
in terms of the functional and representational properties of its internal state, 
the behavior can also be explained solely in terms of the physical and first-order 
causal properties of that state. There is also some sense in which the first- 
order explanation is “more fundamental” than the higher-order explanation. 
However, this fact does not render the explanation in functional terms non- 
causal, or merely heuristic. Nor does it entail the occurrence of some odd sort 
of overdetermination. The two explanations do not compete with each other, 
as two independent physical explanations would do. 

Genuine overdetermination (along with the possibility of competition) exists 
only when one of two conditions is met: (1) the situation-tokens involved in the 
two explanations are mereologically disjoint and causally independent, or (2) 
the situation-types involved in the two explanations are logically unrelated, in 
particular, neither is an instance (at a lower level) of the other.⁵ In the case
of a functional and an underlying physical explanation, neither of these two 
conditions is met. First, the local physical token is part of the functional token. 
In addition to the local physical token, the functional token includes components 
that support the modal and contextual facts needed to give the physical type 
its functional characteristic. Second, the physical type is an instantiation of 
the functional type. The functional type includes quantification over physical 
types, and the corresponding physical explanans involves a physical type that 
is the relevant instance of the generalization contained in the functional type. 
Hence, the two explanations are too intimately connected to compete with one 
another. 

Mental states can be used in genuine causal explanations insofar as they 
participate in a higher-order functional system. For example, the inferential 
system has as one of its functions the detection of beliefs with related logical
forms that match one of its inference schemata. The contents of logically related 
beliefs can thus figure in genuine causal explanations of the production of new 
beliefs through logical inference. Similarly, beliefs, desires, and intentions can 
enter into causal explanations of the production and revision of intentions. 

5 See also section 4.8.2.


Jaegwon Kim (1997b) has argued that higher-order types can never figure
in genuine type-level causal explanations, because they are gerrymandered or
unnatural or unprojectible, in the same way as genuinely disjunctive types. I
have already given an account of genuinely disjunctive types in section 4.8.2.
According to that account, what makes a type genuinely disjunctive is the
separability of any causal constraint involving that type into two or more constraints,
each involving only one of the disjuncts. If higher-order types, such as functions
or representations, figure in causal constraints that are not separable in this
way, then we have a principled basis for distinguishing them from the sort of 
genuinely disjunctive types that cannot be causally efficacious. The existence of 
higher-order functions, functions that take as inputs other functions, including 
representations, would support such non-separable causal constraints. Hence, 
since we have good reason, from biology and cognitive psychology, to believe in 
such higher-order functions, we have good reason to believe that at least some 
higher-order types are causally relevant. 


16.4 Two Further Problems of Mental Causation


Jaegwon Kim has very helpfully identified three problems of mental causation: 
the anomalism problem, the problem of syntacticalism, and the explanatory 
exclusion problem (Kim (1991)). In the previous section, I explained how my 
account handles the explanatory exclusion problem. In this section, I will take 
up Kim’s other two problems. 

The anomalism problem arises from Davidson’s thesis that mental properties 
do not obey strict, exceptionless laws. I have argued that exceptionless laws are 
not a necessary condition of causal connection. Indeed, I argued in part I that 
there is a much better fit between causation and non-strict, defeasible laws than 
there is between causation and strict laws (see chapters 4 and 5). 

The problem of syntacticalism is perhaps best thought of as a problem of 
causal locality. Syntactic properties of brain states are intrinsic properties, while 
semantic properties often have a historical and relational component. Only 
intrinsic states can contribute directly to the causation of behavior. Hence, 
mental-content states play no immediate causal role in behavior. 

On my account, mental-content tokens, like other teleofunctional tokens, in- 
clude components that are intrinsic states of the human body and of the relevant 
part of the brain. Hence, they do participate, via these components, in the pro- 
duction of behavior. Moreover, when the behavior is characterized functionally, 
all of the components of mental-state tokens are involved in the causal process, 
since without the modal components, the mental-state tokens would not carry 
the appropriate content and so would not be appropriate inputs to the higher- 
order functional system that includes the functionally characterized behavior. 
For example, if the behavior in question is a token of the action of expressing 
disapproval, then the cause of the behavior will typically include a mental state 
of disapproval (or the intention to mimic disapproval). Without a mental state 
with such a content, the higher-order function of expressing disapproval would 
not be triggered into action. Hence, the semantic content of the mental state is 
causally efficacious. 


16.5 Qualia


In experiencing phenomenal qualia, we experience our own mental states as 
representational. Does this mean that we have some independent access to 
the causal facts and historical connections that make these perceptual states 
representational? This is not necessary. In the simplest cases, our perceptual 
states simply are states of awareness of some feature of the environment. 

Some materialists, such as Smart and Armstrong, have compared appercep- 
tion to proprioception, the perception of the internal states of our own body. 
Apperception is sometimes likened to the result of some kind of “brain scan- 
ning” faculty of the brain. This seems to be a mistake. My access to my own 
qualia is more privileged and less fallible than my access to my own bodily 
states, including my own brain states. 

Apperception should not be thought of as perception of perception. In- 
stead, I would suggest that we borrow an idea of Donald Davidson’s, viz., his 
‘paratactic’ theory of indirect quotation (Davidson (1967)). Davidson argues 
that in uttering a sentence like ‘Jones says that it is raining’, we are using, 
not merely mentioning, the sentence ‘it is raining’, and that we then point to 
that very speech-act by means of the pronoun ‘that’ in asserting ‘Jones says 
that’. Whether or not this theory will work for quotation, something analo- 
gous is quite promising as an account of apperception. The act of apperception 
actually incorporates the perception as one of its parts, thereby incorporating 
the representational content of the perception as part of its more complex con- 
tent. An act of apperception is something like a thought with the form ‘I am 
experiencing that’, where the pronoun ‘that’ points to a perceptual state. 

A paratactic theory of apperception can explain the difference between ap- 
perception and the perception of a perception, and so can explain the privileged 
access each person has to his own mental states. When I perceive one of my 
own bodily states, that state has itself no representational content. My percep- 
tion of the bodily state must include an original act of representation, with the 
attendant possibility of misrepresentation or error. Similarly, when I perceive 
that you perceive something, my perception cannot include your perception as 
a part, and so my perception must include a separate representation of the con- 
tent of your perception, again opening up the possibility of error. However, 
when I apperceive one of my own perceptions, the act of perception can become 
incorporated into the apperception, and the content of the apperception can be 
determined directly by the content of the perception. Error is impossible, at 
least at this stage. When I conceptualize my experience, categorizing it as an 
experience of ‘red’, for example, it is possible that I will make an error, just as 
I might misapply ‘red’ to an external object. 

This account of qualia depends on the assumption that all qualia correspond 
to mental representations. This must include the secondary qualia of color, 
taste, odor, and so on. In order for these states to be representational, they 
must represent real qualities of the perceived objects. What property of a 
perceived surface is the quality of color? I would suggest that colors are extrinsic 
teleological properties. Very roughly, a surface has the quality of red just in case it 
has some physical property with the extrinsic function (qua part of the ecological
niche of humankind) of stimulating the human visual cortex in a particular way. 
This is a perfectly objective, although admittedly anthropocentric, property of 
the surface. 

I would reject attempts to treat secondary qualities as unreal projections of 
human sensibility. For example, it is a mistake to suppose that something is red 
if and only if it appears red to normal observers in standardized circumstances. 
In addition to the normality of the observer and of the environment, it is also 
necessary to add that the way in which the appearance of the object is caused 
must accord with the proper function of the faculty of color perception. For 
example, suppose that it turned out that a certain iron ore appeared green in 
standard conditions to normal observers, not by virtue of reflecting the right 
sort of light to the observers’ retinas, but by virtue of generating an unusual 
sort of magnetic field that directly stimulates the visual cortex of observers in 
such a way as to make everything look greenish. Such an ore would not really 
be green, since it does not have the proper (extrinsic) function of causing green 
sensations in humans. The causal pathway is functionally deviant. 


16.6 Problem Cases 
16.6.1 The Inverted Spectrum 


Functionalist accounts of consciousness cannot account for what seems to be a 
genuine possibility: individuals whose behavior is indistinguishable from that of 
normal human beings, but who systematically experience things as differently 
colored from the way they are. They use the same words to describe the colors 
of things as normal speakers do, but they experience red things as blue, blue 
things as orange, and so on. No amount of observation of behavior could reveal 
such facts, but they seem to be possible nonetheless. 

On a teleological account of consciousness, such inverted spectrum cases are 
quite possible, and are even discoverable in principle, although not through the 
observation of behavior alone. If a person is in a neurological state with the 
teleofunction of representing things as blue whenever she observes a red object 
in normal circumstances, then there is indeed a mismatch between experience 
and reality, even though this mismatch is entirely covert. 

Ned Block (1990) has produced an interesting variation on the inverted spec- 
trum problem, one intended to demonstrate the falsity of accounts like mine that 
attempt to explain qualia in terms of intentional contents. A normal human 
undergoes a spectrum inversion operation and is simultaneously transported to 
Inverted Earth, a planet on which things similar to things on earth really have 
colors that are inverted relative to their counterparts on earth. On Inverted 
Earth, the sky is yellow, grass is red, etc. To the transportee, everything ap- 
pears normal on Inverted Earth, thanks to his color-inversion operation. He 
notices no change in his internal qualia. However, Block argues, as time pro- 
gresses, the intentional contents of the transportee’s sensory states shift, so that 
blue experiences carry information about the yellowness of their objects, green
experiences carry information about the redness of their objects, etc. If experi- 
ential qualia are determined by intentional contents, then we would have to say 
that the subject’s qualia gradually shift their external correlations, without the 
subject’s being able to detect any difference. This seems quite improbable. 

Since I have taken the position in chapter 14 that teleofunctional states are 
narrow states, supervening on the intrinsic character of the human body (to- 
gether with certain facts about modality and objective chance), I deny that 
transportation to Inverted Earth has or could have any effect on the intentional 
contents of the perceptual states. The interesting question for me is this: when 
does a spectrum-inversion operation affect intentional contents and thereby ex- 
periential qualia? 

From my perspective, everything in this case depends on the exact nature 
of the spectrum-inversion operation. If the operation consists in adding a spe- 
cial lens to the eye that transforms the incoming light, then I would guess that 
neither the qualia nor the intentional contents change. In this case, the most 
likely scenario for the coming-to-be of the altered body would be one in which 
ordinary vision (without the added lens) is the normal state, favored by natu- 
ral selection, and the lens, which introduces only unnecessary complication, is 
some sort of adventitious addition. Consequently, it would be ordinary sense 
perception sans lens that would fix the intentional contents of the perceptual 
states. 

In contrast, suppose that the inversion operation involved a complex process 
of rewiring the subject’s rods and cones. If the result is indistinguishable from 
a vision system that might well have arisen directly in nature, then I would ar- 
gue that the operation changes both the intentional contents of and the qualia 
associated with the resulting brain states. This is impossible if we suppose that 
qualia must supervene on the local state of the central nervous system. How- 
ever, I see no reason for making this assumption. The subject (assuming now 
that he is not transported to Inverted Earth) will report an inversion of qualia 
associated with seeing familiar objects. There are, however, two possible ex- 
planations of these reports: (1) the associated qualia have really been inverted 
by the operation, and (2) the subject’s memories of the qualia associated with 
seeing familiar objects in the past have been systematically perverted by the 
operation. In the case of the rewiring of the retina, I would claim that (2) is 
the correct explanation. After the operation, the subject no longer has a human 
visual system. His system is now that of a distinct species. The qualia that he 
experiences when observing familiar objects are now utterly incommensurable 
with human color qualia. He now experiences qualia that are ineffably different 
from any we have experienced, or that he experienced in the past. His mem- 
ories of his own qualia, experienced when he had a human visual system, are 
systematically in error, based on a faulty identification of those old qualia with 
the new qualia he experiences after the operation. 


16.6.2 Mary the Color Scientist


Frank Jackson (1996) asks us to imagine Mary, a color scientist who knows 
everything there is to know about the physiology and physics of color and color 
perception, but who, sadly, is congenitally color blind. Jackson argues that there 
is something we know about the color red that Mary does not know: namely, 
how red things look (to a human with normal color vision). This fact is somehow 
extra-physical and extra-causal, since, by hypothesis, Mary knows all relevant 
physical and functional facts. To make this case especially relevant in the present 
context, we can imagine that Mary knows all causal and teleofunctional facts 
as well. How then can it be that redness is simply a teleofunctional property? 

I think Jackson’s problem is one of the most difficult, but also one of the
most important, problems for the philosophy of mind. I do not have a fully 
satisfactory solution, but I will sketch out briefly the best account I can find. 
I am inclined to say that although Mary knows all the physical facts of the 
situation, there are certain causal and teleofunctional facts that she, as a color- 
blind person, cannot have access to.⁶ I am inclined to believe that the experience
of red (or of a specific shade of red) carries information about the world that 
is not representable in the language of physics, not even the language of a 
hypothetical, ideal physics. 

Mary is ignorant, not only of certain psychological facts (how red things 
look), but also of certain facts about colored surfaces (that this chair is red). 
She can learn that the chair is a color that is commonly called ‘red’ (or, more 
precisely, that the chair has a property commonly called ‘the color red’), but 
she cannot learn that the chair is red. Even if it is true, as I am willing to 
grant, that color properties loosely supervene on physical properties, so the 
chair could not be a different color without differing in some relevant physical 
properties, it does not follow that exactly the same information carried by the 
state of perceiving the redness of the chair can be carried by some sentence of 
physics (even of the ideal physics). The boundaries of the region in physical 
state-space upon which the shade of red supervenes are very probably fractal 
(infinitely complex). In addition, the indeterminacies and aspects of vagueness 
associated with the color red almost certainly have no exact isomorph among 
the concepts of physics. 

One thing that makes the Mary case so intractable is the Cartesian error of 
locating the secondary qualities entirely in the mind. If Mary were omniscient 
about the extra-mental objects of perception, there is no way to limit her knowl- 
edge of the contents of mental states. However, if we can look at physics, not as 
a potentially exhaustive treasury of truths about the mental world, but rather 
as the product of a ruthless degree of abstraction and idealization, resulting in a 
very narrow but extremely useful mode of description, then we can understand 
why the mental cannot be reduced to the physical. 

What bars Mary from access to the properties of red and of being appeared 
to redly is the existence of a certain kind of epistemic circle (analogous to the 


⁶ This is exactly the position that Gilbert Harman (1990) takes. 




problem of the “hermeneutic circle”). Red can be identified with the teleo- 
functional state whose extrinsic proper function is to produce (via the normal 
operation of the visual system) the state of being-appeared-to-redly in humans. 
The state of being-appeared-to-redly can be identified as the teleofunctional 
state in humans whose intrinsic proper function is to convey information about 
the contemporary presence of a red object in a certain place. The two functions 
are constitutionally intertwined: each exists for the sake of the other.⁷ There 
is no real mystery about how this came to be: color perception enables us to 
classify and track physical objects more effectively, and shared color perception 
enables us to use colored objects and color terms to enhance communication (a 
red sign means Stop!). Nonetheless, the tightness of the circularity bars those 
lacking normal human color perception any direct cognitive access to the two 
properties, and thus, to any facts including either one. 

In the case of colors, the microphysical structures that underlie the sensi- 
ble qualities are of no interest to us, except insofar as they stably and reliably 
support the same colors as context and perceiver are varied. Color perceptions 
convey but do not represent information about the reflectances of different wave- 
lengths of light, since it is not part of the function of color perception to convey 
this information. It is not part of the function of color perception to carry in- 
formation about wavelengths, but only to carry information about how surfaces 
normally appear to human observers. Color perception and the perception of 
auditory qualities are perhaps distinctive in this respect: our perception of pri- 
mary qualities, and also our perception of many smells and tastes, do often have 
the function of carrying information about the underlying physical and chem- 
ical properties of the perceived object. For example, the function of the taste 
of saltiness is to indicate the presence of NaCl and similar compounds. We can 
distinguish two components of the representational content of sensory qualia: 
(1) the intersubjective component, and (2) the purely physical component. The 
content of every form of qualia contains an intersubjective component. Some 
forms of qualia have a content with a purely physical component, while others, 
such as color, do not. In any case, it is knowledge of the intersubjective com- 
ponent that depends (for us, at least) on actual experience of the sensations in 
question. No quantity of information about the physical component of the con- 
tent, or the physical properties upon which the quality supervenes, can provide 
knowledge of this intersubjective component. 

The circularity of secondary qualities and their corresponding phenomenal 
appearances poses a problem at the semantic or ontological level. If redness 
and the appearance of redness are complementary, interdependent functions, 
there would seem to be a problem about fixing the extension of words like ‘red’ 
in natural language. If ‘red’ is defined as referring to the extrinsic function 
of producing reddish appearances, and ‘reddish appearance’ is defined as the 
intrinsic function of a mental state to carry information about the location of a 


⁷ To put the matter formally, the property of redness can be identified with the extrinsic 
teleological type $SY(Y \mathbin{\&} \tau_{ext}(Y, \text{human}, \text{appeared-to-redly}))$, and the property of 
being-appeared-to-redly with the intrinsic teleological type $SY(Y \mathbin{\&} \tau_{int}(Y, \text{human}, (Y \Rightarrow_p \text{red})))$. 
red surface, then we are faced with the puzzle of how to understand the resulting 
circularity. 

In a recent book on the Liar paradox, Gupta and Belnap (1993) develop a 
theory that vindicates the legitimacy and usefulness of circular definitions of 
exactly this kind. The circular definitions can be thought of as revision rules, 
rules for revising initial guesses about the correct extensions of the defined 
predicates. A defensible interpretation of the predicates consists of a fixed point 
in this revision process. 

In the case of sensory qualities, the definitions of all of the quality/phenomenon 
pairs take exactly the same form. Different qualities correspond to different 
fixed points in the Gupta/Belnap revision procedure. Thus, the circularity in 
conception is not semantically vicious. 
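
The idea of circular definitions as revision rules can be illustrated with a toy model. What follows is only a sketch under stated assumptions, not Gupta and Belnap's full theory: the four surfaces, the two appearance-types, the `PRODUCES` table, and the functions `revise`, `subsets`, and `fixed_points` are all hypothetical. A guess pairs an extension for the quality with an extension for the appearance; the rule recomputes each extension from the other, and a fixed point is a guess that the rule leaves unchanged.

```python
from itertools import combinations

# Hypothetical toy world: the appearance-type each surface normally produces.
PRODUCES = {"s1": "a1", "s2": "a1", "s3": "a2", "s4": "a2"}
SURFACES = frozenset(PRODUCES)
APPEARANCES = frozenset(PRODUCES.values())


def revise(hypothesis):
    """Apply the circular definitions once to a guessed pair of extensions."""
    quality, appearance = hypothesis
    # A surface has the quality iff it normally produces an appearance
    # that the current guess counts as the paired appearance.
    new_quality = frozenset(s for s in SURFACES if PRODUCES[s] in appearance)
    # An appearance-type is the paired appearance iff every surface that
    # normally produces it is counted by the current guess as having the quality.
    new_appearance = frozenset(
        a for a in APPEARANCES
        if all(s in quality for s in SURFACES if PRODUCES[s] == a)
    )
    return new_quality, new_appearance


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    ordered = sorted(items)
    return [frozenset(c) for r in range(len(ordered) + 1)
            for c in combinations(ordered, r)]


def fixed_points():
    """Guesses that the revision rule leaves unchanged."""
    return [(q, a) for q in subsets(SURFACES) for a in subsets(APPEARANCES)
            if revise((q, a)) == (q, a)]


for quality, appearance in fixed_points():
    print(sorted(quality), sorted(appearance))
```

In this toy world the rule has four fixed points: the empty pair, the universal pair, and two non-trivial pairs. The two non-trivial pairs arise from one and the same definitional form, yet they pick out different quality/appearance pairs, which is the feature the argument relies on.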

Materialist responses to the Mary problem have all involved introducing 
some sort of distinction between real, coarse-grained facts and merely concep- 
tual, fine-grained facts (Tye (1995)), or between facts and “phenomenal infor- 
mation” (Lycan (1996)). Mary is supposed to know all the real facts about 
redness and red experience, but lack some sort of phenomenal or conceptual 
information. Lycan uses the analogy of de se knowledge: I can know that Rob 
Koons is overpaid without knowing that I am overpaid, if I have forgotten my 
identity. I possess all the facts, but I am still missing some potentially useful 
information, namely, that I am Rob Koons. 

It is hard to see how this sort of materialist response can possibly succeed. 
If Mary is lacking in conceptual or phenomenal information, then there is some 
fact of which she is ignorant, namely, the fact that a certain phenomenal con- 
cept (namely, red) applies to certain actual experiences and is associated with 
a certain English word. She knows these experiences under their neurophysiological 
descriptions, and she knows that there is a phenomenal concept 
corresponding to the word ‘red’, but she cannot know de re of the phenomenal 
concept red, that it applies to these experiences, since she can have no de re 
attitudes involving the phenomenal concept at all. 

The central issue here concerns the nature of facts. I would propose that 
we think of facts as the combination of a situation-token and a situation-type, 
where the token is actual and supports the type. To know a fact, one must be 
able to represent it. To represent a fact, one must be able to represent both the 
token (via, perhaps, some relation between the token and one’s representation- 
token) and the type. Mary cannot have mental states that directly represent the 
type red. She can represent this type indirectly, by means of a representation 
equivalent in content to the definite description the surface-property designated 
in English by the word ‘red’ or to similar descriptions. But this does not give 
her access to the fact that the chair is red, but only to the (distinct) fact that the 
chair has the surface-property designated in English by the word ‘red’. 


16.6.3 Killer Yellow and Magnetic Green 


A killer yellow object is one which is in fact yellow in color, but which cannot be 
perceived by human beings because it emits such powerful and lethal radiation 
that any human would be vaporized long before he or she got close enough to 
the object to observe its color. The possibility of killer yellow objects, whose 
postulation is attributed to Saul Kripke, raises serious problems for accounts 
that analyze colors as dispositions to cause perceptions of certain kinds in normal 
observers under normal circumstances. Killer yellow objects are yellow even 
though they have no disposition to cause yellow sensations in normal humans 
under normal circumstances — they have instead the disposition to vaporize 
such normal observers under such circumstances. 

Killer yellow objects are yellow because they have some physical property 
that has (when realized by other, less lethal objects) the teleofunction of in- 
teracting with the human visual system in a certain way to produce yellow 
sensations. Yellowness is inherited by an object because of its possession of a 
higher-order type, not because the object itself would or even could fulfill the 
corresponding function. 

A magnetic green object is a colorless object that causes green sensations 
in normal observers under normal circumstances, but does so by means of a 
powerful magnetic field that induces visual hallucinations. To make the case 
especially devious, suppose that the magnetic field always produces a greenish 
hallucination in exactly the part of the visual field of the subject that would 
be occupied by the magnetic green object. Magnetic green objects also pose 
a challenge to dispositional accounts of color, since they are not green, despite 
their possessing the right disposition. Magnetic green objects are not green 
because of the deviancy of the causal chain leading to the green sensation. 
They subvert and do not fulfill the human capacity for color sensation. 


16.6.4 Zombies 


Is it possible that an organism could be physically identical to a human being 
and yet be a zombie, experiencing no qualia whatsoever? The answer to this 
question depends on what is involved in being physically identical. Suppose that 
there is a possible world w in which all of the physical properties of the actual 
world exist, but in which the modal and stochastic facts, and the associated 
causal laws, are radically different. In world w, there could exist a physical 
duplicate of me whose physical states are entirely lacking the higher-order, tele- 
ofunctional properties that my actual physical states possess. It might be that 
some such physical duplicate would be entirely lacking in teleofunctional prop- 
erties. If so, the duplicate would lack all mental properties, including those 
of experiencing qualia. Of course, the behavioral dispositions of the duplicate 
would be radically different, as well. 

It is this sort of conceivability of physical-duplicate zombies that explains 
why there is an explanatory gap between the properties of physics and those of 
consciousness. It is not so clear, however, that there is any such gap between 
teleofunctional properties and the properties of consciousness. 
16.7 The Correlation of Qualia and Physiology 


Qualia, such as being appeared to greenly, are regularly associated with cer- 
tain physiological states. What accounts for this reliable correlation? If some 
property-identity thesis were true, an answer would be readily available: the 
qualia properties are identical to the neurophysiological properties in question. 
However, the example of Mary the color scientist demonstrates that no such 
identity thesis is correct. This means we have a substantial problem of explain- 
ing the regular association of pairs of quite different properties. 

For dualists, this correlation must be accepted either as a brute fact, or as the 
product of divine fiat (see Adams (1987), Swinburne (1979)). For materialists, 
only the brute fact option is available. 

However, on the teleological account of qualia, a non-trivial explanation 
(without resort to divine intervention) is readily available. The state of being- 
appeared-to-greenly is supposed to carry robustly the information that a green 
object is present and visible. An investigation of human neurophysiology can 
explain how it is that the presence of green objects, in normal circumstances and 
with respect to normal human observers, causes the particular neurophysiologi- 
cal state regularly associated with the appearance of greenness. The availability 
of this explanation depends on two facts: (1) that the essence of the property 
of being-appeared-to-greenly includes a certain content, namely, 
the visible presence of a green object, and (2) that the corresponding external 
property (greenness) really exists. If we acknowledge these two facts, then there 
is no explanatory gap between the neurophysiological property and the qualia, 
despite their distinctness. 


16.8 Free Will 


The will is a faculty whose function is to make apt choices, choices that further 
the agent’s good. A “free” will is a will that is not disabled at the point of action, 
a will that really does select one from a list of many options. A perfectly unfree 
will is a will that, at the point of action, is disabled. In cases of unfreedom, no 
choice is made: the agent was capable of considering and acting out only one 
course of action. 

Freedom of will is limited when the will is unable to adopt courses 
of action that it should (ideally) be able to adopt. Being unable to think the 
unthinkable is not a case of unfreedom. If I am unable to consider the possibility 
of abandoning my child, this is not a case of unfreedom, since this is not the 
sort of option that my will is supposed to be able to consider (i.e., that it must 
consider if all of my primary teleofunctions are to be fulfilled). One who is able 
to consider unthinkable options has a will that is not thereby freer, but merely 
more licentious. 

As Aristotle noted, freedom or voluntariness is neither sufficient nor nec- 
essary for responsibility. One can be responsible for an unfree act if one was 
responsible for causing the unfreedom. One can fail to be responsible for a free 
act if one was, through no fault of one’s own, ignorant of the nature or consequences 
of the act freely taken. 

In part I, I gave several reasons for rejecting determinism in the strict sense, 
according to which every state has a strictly necessitating cause. In the end, 
determinism in this sense is incompatible, not only with free will, but with the 
central features of causation itself. However, there is a more modest version of 
determinism that seems quite coherent: one that combines an indeterministic 
conception of causation with the theses of the soundness and completeness of 
causal explanation. Even though no cause strictly necessitates its effects, it 
might still be the case that for every wholly contingent state there is an actual 
causal explanation of that state that is adequate and undefeated. This is the 
hypothesis that I called the Completeness of Explanation in section 5.4. 

If we combine the necessary completeness of explanation with the indeter- 
ministic model of causation, we end up with a mitigated form of determinism. 
Causal explanation was defined in part I (section 5.4.2) as a prior state that 
is both defeasibly sufficient for the explanandum and actually undefeated. The 
explanans need not necessitate the explanandum, since there could have existed 
defeaters of the explanans. However, if we add to the explanans the negative in- 
formation that no such defeaters exist, the resulting situation would, given the 
necessity of explanatory completeness and the existence of some effect of the 
explanans, necessitate the existence of the explanandum. Thus, there would be 
only two differences between strict and mitigated determinism: 


• Strict determinism implies that the continuation of the course of the world 
is itself necessary, whereas mitigated determinism implies only that if the 
course of the world continues, it must continue in a unique way. 


• Strict determinism implies that effects are necessitated by their causes, 
whereas mitigated determinism implies that the effects are necessitated 
by the sum of their causes plus a background situation rich enough to 
exclude the existence of any possible defeaters of the cause. 


These differences do not seem to be great enough to secure the kind of 
openness of the future, the real possibility of alternative courses of action, that 
genuine free will seems to require. 

One possible alternative would be to suppose that the hypothesis of the 
completeness of explanation is only contingently true, but then it is hard to 
see how we could have any basis for confidence in its truth in the actual world. 
However, there is a third alternative. It would be quite reasonable to take the 
completeness of explanation in any given case to be a truth with a very high 
objective probability, perhaps infinitely close to 1. If we embraced such a modest 
determinism, which we might call default determinism, we could still confidently 
expect to find causal explanations of every fact, and still be able to affirm that 
many things could have gone otherwise than they did. 

Default determinism is thus fully compatible with the real possibility that 
things could have gone otherwise, even without any change in causally an- 
tecedent facts. This real possibility of alternative courses is important, not 
only because without it we feel a reluctance to assign moral responsibility, 
but also for a more fundamental reason. The veridicality of our deliberations 
entails that the various options we represent to ourselves are really possible in 
our actual circumstances. For example, in causal decision theory (Gibbard and 
Harper (1981), Lewis (1981)), we evaluate a decision situation in three steps: 
first, discover which actions are possible in the present circumstances; second, 
discover the probable consequences of each possible action; and finally, assign 
a utility value to each of the possible outcomes, evaluating a possible action by 
means of the probability-weighted average of the utility of its possible results. 
If necessitarian determinism were true, then all representations of the possibil- 
ity of options other than the one actually taken would be illusory. This would 
mean that all deliberation was shot through with error, making the evaluation 
of decisions as optimal or sub-optimal impossible. Since all deliberation aims 
at action that is objectively optimal in the circumstances, necessitarian deter- 
minism would entail that all deliberation is aiming at a will-o’-the-wisp. In 
addition, without real possibilities that are alternative to the actual future, it 
would be impossible for our representations of alternatives to have the function 
of carrying robust information about the existence of such alternative futures. 
This would mean that we could not give a realist semantics for our apparent 
beliefs about what is still possible. 
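
The three-step evaluation described above can be made concrete with a minimal sketch. The option names, probabilities, and utilities are hypothetical numbers chosen only for illustration, and the function names `expected_utility`, `evaluate`, and `best_action` are assumptions of this sketch. The probabilities are simply taken as given; nothing here settles how causal decision theory would derive them.

```python
def expected_utility(outcomes):
    """Probability-weighted average utility of one action.

    `outcomes` is a list of (probability, utility) pairs.
    """
    return sum(p * u for p, u in outcomes)


def evaluate(options):
    """Map each possible action to its expected utility."""
    return {action: expected_utility(outs) for action, outs in options.items()}


def best_action(options):
    """Return the action with the highest expected utility."""
    scores = evaluate(options)
    return max(scores, key=scores.get)


# Step 1: the actions taken to be really possible (hypothetical).
# Step 2: the probable consequences of each, as (probability, utility) pairs.
OPTIONS = {
    "take the train": [(0.9, 10.0), (0.1, 2.0)],
    "take the plane": [(0.7, 12.0), (0.3, -5.0)],
}

# Step 3: evaluate each action by its probability-weighted average utility.
print(evaluate(OPTIONS))
print(best_action(OPTIONS))
```

The sketch makes the dependence explicit: the ranking is computed over the whole set of options, so an error about which options are really possible corrupts the evaluation of every one of them.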

Determinists might object that all we need is the doxastic or epistemic possi- 
bility of alternative futures: all that is needed is that the deliberator be ignorant 
of which of several alternative futures is already determined to be realized. How- 
ever, this seriously misrepresents the nature of deliberation. In deliberation, we 
are not interested in finding out only what alternatives are possible, for all we 
presently know. We are interested in discovering which alternatives are really 
possible. We actively seek information that would improve and correct our cur- 
rent beliefs about the range of available options. The determinist cannot explain 
our interest in gaining new modal information. 

In addition, ignorance about the future course of things is not a necessary 
condition for deliberation. Suppose that I have already made up my mind to 
take the train tomorrow, confident that this is the optimal choice. You challenge 
my reasoning, arguing that I have made a mistake, and that the truly optimal 
choice for me is to take an airplane instead. I can refute you by engaging in 
a process of re-deliberation, and I can do so without in the slightest degree 
reducing my confidence that I will in fact take the train. What is important is 
not that there is any practical doubt in my mind about whether I will take the 
train, but that it is a genuine ontic possibility that I do otherwise. This I can 
grant, while admitting only an infinitesimal degree of doubt about the actual 
course of action I shall take. 

Suppose the determinist concedes that deliberation requires the acknowledg- 
ment of the genuine possibility of several alternative futures but continues to 
insist that the laws of nature and the history of the world so far pre-determine 
a unique course. He could do so by postulating that it is a contingent matter 
whether the laws of nature will continue to hold. This defense of determinism 
would have the bizarre consequence that I would have to take into account, 
in my deliberations, possible futures in which the laws of nature are violated. 
Surely, once I know something is a law of nature, it would be irrational for me 
to consider its violation a real possibility in the future. 


17 


Teleological Reliabilism 


17.1 Reliabilism: The Reference Class Problem 


Gettier’s seminal article on justified true belief (Gettier (1963)) forced a rethinking 
of the conditions of knowledge.¹ So-called external factors and analyses 
came to the fore. The distinction between knowledge and mere true opinion 
turns on the way in which the belief was formed and how it is sustained. If 
the formation and maintenance of the belief reliably tracks the truth, then 
the belief constitutes knowledge. Knowledge is reliably formed and maintained 
belief (Goldman (1979)). 

Reliability is a matter of probability. A way of forming beliefs is reliable if 
the objective probability of a belief’s being true, given that it is a product of 
that way, is very high. This means that everything turns on how we specify the 
various ‘ways’ of forming beliefs. 

Without any principled answer to this question, issues of reliability can be 
settled any way we please by simply jury-rigging the definition of the ‘way’ 
in which the belief was formed. Suppose I believe that it will rain tomorrow 
because my friend said so. Which of the following is the ‘way’ in which this 
belief was formed? 


• By testimony 

• By testimony from a friend 

• By testimony from my friend 

• By testimony from my friend or the Encyclopedia Britannica 

• By testimony that happens in this case to be true 


¹ Gettier demonstrated that justified true belief is not sufficient for knowledge. He gives 
the example of Smith, who is justified in believing Jones owns a Ford. Smith uses the laws of 
logic to derive the belief Jones owns a Ford or Brown is in Barcelona, having no idea where 
Brown in fact is. As it happens, Smith’s belief about Jones is false, since Jones recently sold 
his Ford but lied to Smith about it, but Smith’s belief in the disjunction is true, since, by 
coincidence, Brown happens to be in Barcelona. Smith’s belief in the disjunction is true (by 
virtue of the truth of the second disjunct) and justified (by virtue of his justified belief in the 
first disjunct), but this justified true belief does not constitute knowledge. What is missing 
is the right sort of connection between the truth-maker of the disjunction (Brown’s location) 
and Smith’s belief-state. 


One could get very different answers to the question of probability, from 
nearly 0 to 1, depending on one’s choice of the reference class. Is there any 
principled way to choose a (or the) correct class? 
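
The sensitivity of the answer to the choice of reference class can be shown with a small sketch. The track record below is invented purely for illustration, relative frequency stands in (as an assumption) for objective probability, and the names `RECORDS`, `REFERENCE_CLASSES`, and `reliability` are hypothetical.

```python
from fractions import Fraction

# Hypothetical track record of testimony received: (source, was it true?).
RECORDS = [
    ("stranger", False), ("stranger", False), ("stranger", True),
    ("another friend", True), ("another friend", True),
    ("my friend", True), ("my friend", True), ("my friend", False),
    ("Britannica", True), ("Britannica", True), ("Britannica", True),
]

# Each candidate 'way' of forming the belief is a membership test on records.
REFERENCE_CLASSES = {
    "testimony": lambda source, true: True,
    "testimony from a friend":
        lambda source, true: source in ("another friend", "my friend"),
    "testimony from my friend": lambda source, true: source == "my friend",
    "testimony from my friend or Britannica":
        lambda source, true: source in ("my friend", "Britannica"),
    "testimony that happens to be true": lambda source, true: true,
}


def reliability(in_class):
    """Fraction of true testimonies among the records in the reference class."""
    members = [true for source, true in RECORDS if in_class(source, true)]
    return Fraction(sum(members), len(members))


for name, in_class in REFERENCE_CLASSES.items():
    print(f"{name}: {reliability(in_class)}")
```

One and the same belief receives a different reliability score under each description, and the last class trivially yields a score of 1. Nothing in the data itself selects among the classes.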

Teleology to the rescue! As Alvin Plantinga (1993) has convincingly argued, 
what makes for knowledge is not merely reliability, but the fulfillment of the 
function of reliability. Every belief is formed by a combination of a number of 
neural states with the intrinsic function of carrying reliable information, and a 
number of environmental factors with the extrinsic function of conveying reliable 
information to us. When all of these functions are fulfilled, the resulting belief 
is a state of knowledge. When malfunction occurs, the belief is a mere opinion, 
true if it happens to coincide in content with what would have been believed 
had there been no malfunction, false otherwise. 

To return to the case of testimony, malfunction could occur at a number 
of points. There might already have been malfunction in my friend’s belief 
formation. My friend might be lying, which would be a malfunction in the 
extrinsic function of my friend’s speech as part of my environment. I might have 
misunderstood what my friend said. Or, there might be some other malfunction 
in my processing of my friend’s statement. For example, I might wrongly believe 
both that he’s lying and that he’s always wrong about the weather. A great 
many things can be involved in the ‘way’ in which I form my belief, but not 
just anything. A factor can be introduced as relevant to the classification of my 
belief as knowledge only if there is some teleological connection between that 
factor and my belief. 


17.2 Grue, Bleen, and the New Riddle of Induction 


As Plantinga has argued, a teleological epistemology has a fairly simple answer 
to the Humean puzzle about induction. Induction is reasonable because induc- 
tion accords with the teleofunction of certain belief-forming processes in the 
human mind. Reason is not to be identified with deductive logic alone: rea- 
son is itself an inescapably teleological concept. We think reasonably when we 
think in accordance with the proper, intrinsic functions of our mind. No further 
justification of reason is needed or possible. 

Nonetheless, we are still left with the substantive task of characterizing the 
inductive functions of the mind. Nelson Goodman’s new riddle of induction 
shows us that the definition of induction is no trivial task. We cannot simply 
say that induction consists in inferring that unobserved tokens will resemble 
observed ones. We must say something substantive about what sorts of resemblances 
are, and what sorts are not, reasonable to project in induction. 
For example, if we define the property of “grue” as follows: 


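$$\forall x\,\forall t\,\bigl(\mathrm{Grue}(x,t) \leftrightarrow [(\mathrm{Obs}(x) \mathbin{\&} \mathrm{Green}(x,t)) \lor (\neg\mathrm{Obs}(x) \mathbin{\&} \mathrm{Blue}(x,t))]\bigr)$$ 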


where “Obs(x)” means that object x will have been observed at least once before 
the year 2001, then we will find that all heretofore-observed green things have 
also been grue. In particular, all emeralds so far observed have been grue. If 
we assume the future will be like the past, should we predict that emeralds first 
discovered after the year 2001 will be green or grue? For emeralds discovered 
after 2000, these two predictions are incompatible, since an emerald that is grue 
and unobserved until 2001 must, according to the definition, be blue. 
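
The definition can be checked mechanically. The sketch below is a simplification under stated assumptions: the time index t is suppressed, so each emerald has a single colour, and the `Emerald` class and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emerald:
    color: str      # colour at the time of evaluation (time index suppressed)
    observed: bool  # Obs(x): observed at least once before the year 2001


def green(x):
    return x.color == "green"


def grue(x):
    # (Obs(x) & Green(x)) or (not Obs(x) & Blue(x))
    return (x.observed and x.color == "green") or (
        not x.observed and x.color == "blue")


# Hypothetical evidence: every emerald observed so far is green.
evidence = [Emerald("green", True) for _ in range(5)]
print(all(green(e) for e in evidence), all(grue(e) for e in evidence))

# For an emerald not observed before 2001, the two predicates come apart.
for candidate in (Emerald("green", False), Emerald("blue", False)):
    print(candidate.color, green(candidate), grue(candidate))
```

On the observed sample the two predicates agree in every case, so the evidence supports "all emeralds are green" and "all emeralds are grue" equally. For an unobserved emerald exactly one of the two holds, which is why the two projections are incompatible.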

There is a simple answer that teleological epistemology can give to this 
question as well (in fact, Plantinga (1993, pp. 133-136) offers exactly 
this answer). We can say that for a mind to use a property like grue in induction 
is to malfunction. However, in this case, I am not content with stopping with 
this simple answer, since we would like to know what exactly constitutes the 
malfunction in the case of properties like grue. 

The correct diagnosis depends on the fact that the property of grue is dis- 
junctive in a way that the property green is not. If we grant this assumption for 
a moment, the problem with the grue inference is that we are using a disjunctive 
property in our projection, despite the fact that all of the observed instances fall 
under only one of the two disjuncts. In the absence of any knowledge linking 
the two disjuncts, this is an unreliable procedure, and hence one that we can 
imagine that natural selection has disfavored. 

What if we drop the troublesome disjunct? Why is it unreasonable to infer 
that all the emeralds in the world will have been observed before the year 2001, 
given that all observed emeralds have been so observed? In this case, it is the 
conjunctive nature of the property that is the source of the problem, since one 
of the conjuncts is illegitimately egocentric and time-bound. Once again, it is 
easy to see how allowing such properties to occur in inductive procedures would 
lead to unreliable results. 

However, is it so clear that grue is a disjunctive property, in fact, a dis- 
junction of two conjunctive properties? Doesn’t this involve a kind of category 
mistake? Isn’t it predicates or phrases, and not properties, that can be disjunc- 
tive or conjunctive? If we concede that this is so, we are immediately in trouble, 
since we can imagine a language in which ‘grue’ is the primitive and ‘green’ is 
defined in terms of ‘grue’ and ‘bleen’. 
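
The symmetry that drives this objection can be verified directly. The sketch assumes the standard mirror-image definition of ‘bleen’ (blue if observed before 2001, green otherwise), which the text does not spell out, and again suppresses the time index; all function names are hypothetical.

```python
from itertools import product


def grue(color, observed):
    return (observed and color == "green") or (not observed and color == "blue")


def bleen(color, observed):
    # Assumed mirror-image definition: blue if observed, green otherwise.
    return (observed and color == "blue") or (not observed and color == "green")


def green_from(is_grue, is_bleen, observed):
    """'Green' defined with 'grue' and 'bleen' as the primitives."""
    return (observed and is_grue) or (not observed and is_bleen)


def blue_from(is_grue, is_bleen, observed):
    """'Blue' defined with 'grue' and 'bleen' as the primitives."""
    return (observed and is_bleen) or (not observed and is_grue)


# Check every combination of colour and observation status.
for color, observed in product(("green", "blue", "red"), (True, False)):
    g, b = grue(color, observed), bleen(color, observed)
    assert green_from(g, b, observed) == (color == "green")
    assert blue_from(g, b, observed) == (color == "blue")
print("green and blue are recoverable from grue and bleen")
```

Syntactically, the definitions of green and blue from grue and bleen have exactly the same disjunctive shape as the definitions of grue and bleen from green and blue. The form of the defining phrase therefore cannot by itself show which pair is the simple one.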

There is, however, a transcendental argument that demonstrates that the 
kind of nominalism, like Goodman’s, that denies the distinction between dis- 
junctive and simple properties is incoherent. Every proposed solution to the 
grue puzzle makes a covert appeal to a Platonic distinction between simple and 
complex properties. If we distinguish between green and grue on phenomenolog- 
ical or epistemological or historical-cultural grounds, we ignore the fact that the 
categories of phenomenology, epistemology, and history can also be subjected 
to grue-like transformations. 

For example, suppose we try to distinguish between green and grue on the 
grounds that there is a phenomenological quality that corresponds to the first 
and not the second. This ignores the fact that there is a phenomenological 
quality which corresponds to grue, the property of ‘being appeared to gruely’. 
One is appeared to gruely if and only if one is either being appeared to greenly 
not as a result of observing an emerald first observed (if at all) after 2000, 
or one is being appeared to bluely as a result of observing an emerald first 
observed after 2000. If one objects that being appeared to gruely is not a 
genuine phenomenological property, because, perhaps, it is not introspectible, 
I can simply respond by asking, Upon what basis do you say that it is being 
appeared to greenly, and not being appeared to gruely, which we are able to 
introspect? Every introspective act in response to an event in which the one 
property was instantiated was also an act in response to an event in which 
the other was instantiated. Why does the epistemologist identify the content 
of these introspections with the quality of green instead of grue? It must have 
to do with the fact that being appeared to greenly is an intrinsically simpler 
property than being appeared to gruely. 

Even Goodman’s solution in terms of entrenchment is vulnerable to this 
attack. On what basis can we say that it is green rather than grue which 
has been entrenched by our past practice? Since the two properties have been 
so far coextensive, it would be equally charitable to interpret the established 
English word ‘green’ as signifying grue or as signifying green. In order to abort 
this infinite regress, one must appeal at some point to a difference in intrinsic 
simplicity between the two properties. 
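
The coextension claim can be made concrete with a minimal sketch. The names `Observation`, `is_green`, and `is_grue` are illustrative assumptions, not anything defined in the text; the cutoff year 2000 follows the definition of being appeared to gruely given above. The sketch only shows that the two predicates agree on every emerald first observed up to the cutoff and diverge afterward, which is why past usage alone cannot settle which one has been entrenched.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 2000  # cutoff used in the text's definition of 'grue'


@dataclass(frozen=True)
class Observation:
    """A hypothetical record of an observed emerald (illustrative only)."""
    color: str            # "green" or "blue"
    first_observed: int   # year in which the emerald was first observed


def is_green(obs: Observation) -> bool:
    return obs.color == "green"


def is_grue(obs: Observation) -> bool:
    # Grue: green if first observed up to the cutoff, blue if first observed after it.
    if obs.first_observed > CUTOFF_YEAR:
        return obs.color == "blue"
    return obs.color == "green"


# Every observation made so far satisfies both predicates or neither.
past = [Observation("green", year) for year in (1850, 1975, 1999, 2000)]
coextensive_so_far = all(is_green(o) == is_grue(o) for o in past)

# After the cutoff the two predicates come apart.
future = Observation("green", 2001)
diverges_later = is_green(future) != is_grue(future)
```

Nothing in the code favors one predicate as the primitive: either could be written in terms of the other, which is the symmetry the argument exploits.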

This solution does not depend on what we might call “strong” Platonism 
or Aristotelian essentialism: the view that the world contains ready-made, pre-cut 
categories to which we must simply attach labels. Instead, what is necessary 
is that there be a mind-independent abstraction-space, in relation to which we 
can distinguish those possible categories that are convex and those that are not. 
The category ‘green’ corresponds to a single convex region in this abstraction 
space (a single “regularity” to which agents might be “attuned”), whereas ‘grue’ 
corresponds to two disparate regions.² 

² See also section 5.8.1 for a further discussion of the causal irrelevance of merely disjunctive 
properties. 


17.3 Curve-Fitting: The Problem of 
Mathematical Simplicity 


It has been known from antiquity that the simplest explanation is the best. This 
is often encapsulated as Occam’s razor: do not multiply entities needlessly. In 
appendix B, I will argue that the crucial thing that we must minimize in our 
explanations is the extension of the causal priority/relevance relation. It is not 
that we should suppose that only those particulars exist that are needed in our 
causal explanations, but rather that only those particulars are relevant to a 
particular explanandum that are needed in explaining it. 

This minimization of causal relevance extends to the mathematical domain 
as well. We should suppose that a mathematical structure is causally relevant 
to a particular phenomenon only if it is needed in constructing an adequate 
explanation of the phenomenon. This means that we should not use integers if 
the natural numbers will do, or real numbers if the integers will do. We should 
not use multiplication if addition alone is needed, or exponentiation if addition 
and multiplication are adequate. 

It is not that we are supposing that the reals do not exist if they are not 
needed in formulating a particular explanation. Rather, it is that we should 
suppose that the reals are causally irrelevant to the phenomenon unless they 
are needed. Since in my view, mathematical objects can be causally connected 
to physical events through the medium of modality, the very same Occamist 
considerations that lead us to prefer the physically simplest theory should lead us 
to prefer the mathematically simplest theory. If a linear relationship is adequate 
to explain the data points, then we should infer that the causal structure of 
the world in this domain is linear in character, despite the fact that there are 
infinitely many curves passing through those same data points. 
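
The underdetermination mentioned here can be illustrated with a small sketch. The data points and the family of rival curves are hypothetical, chosen only for the example: four points lying on the line y = 2x + 1, and rivals obtained by adding a multiple of a polynomial that vanishes at every observed x. Each rival fits the data exactly, yet the rivals disagree with the line (and with one another) at the first unobserved point.

```python
# Hypothetical data: four points lying on the line y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [2 * x + 1 for x in xs]


def linear(x):
    """The simplest curve through the data."""
    return 2 * x + 1


def rival(c):
    """A rival curve: the line plus c times a polynomial that is zero at every observed x."""
    def curve(x):
        bump = c
        for xi in xs:
            bump *= (x - xi)
        return linear(x) + bump
    return curve


# Any value of c yields a curve through the same data points.
rivals = [rival(c) for c in (1, -3, 10)]
fits_all = all(curve(x) == y for curve in rivals for x, y in zip(xs, ys))

# The curves part ways as soon as we leave the observed range.
predictions_at_4 = [linear(4)] + [curve(4) for curve in rivals]
# predictions_at_4 == [9, 33, -63, 249]
```

Since c can be any number, the data alone never exclude the rivals; on the view defended here, what excludes them is the preference for the mathematically simplest adequate structure.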


17.4 The Reliability of Simplicity 
as a Criterion of Truth 


Inference to the best explanation involves an application of Occam’s razor, since 
the best explanation is the simplest explanation, the one that assumes the small- 
est set of causally relevant factors. Presumably, Occam’s razor is a principle 
governing the proper functioning of the human mind. However, even if this 
is true, we still face the difficulty of meeting Hume’s challenge: how do our 
super-empirical concepts acquire the power to designate what they do? I must 
meet this challenge in essentially the same way as that in which I explained the 
representative character of perceptual ideas. A non-perceptual or theoretical 
mental state x carries the representational content that a token is of type φ just 
in case it is the teleofunction of state x to carry the information that a token is 
of type φ. 

In order for the data to carry the information that the simple explanation 
is veridical, it must be the case that the objective probability of the simple 
explanation, conditional on the data, is very high (in fact, infinitely close to 
one). We can ask, under what conditions is this the case? Two things must be 
true: 


1. There must be a significant (finite) probability that any phenomenon we 
encounter is in fact caused by a relatively simple ensemble of relevant 
factors. 
2. The probability that a set of data that is in fact caused by an extremely 
large and complex causal mechanism should be amenable to a simple 
would-be explanation must be extremely low (infinitesimal). 
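
The two conditions can be put in Bayesian form. What follows is only a sketch under simplifying assumptions that are not in the text: the hypotheses are partitioned into S (the data are produced by the simple ensemble that the simple explanation posits) and C (they are produced by a large, complex mechanism); p is the prior probability of S; q is the probability of the data D given S; and ε is the probability of D given C.

```latex
\[
P(S \mid D)
  = \frac{P(D \mid S)\,P(S)}{P(D \mid S)\,P(S) + P(D \mid C)\,P(C)}
  = \frac{q\,p}{q\,p + \varepsilon\,(1 - p)}
  = 1 - \frac{\varepsilon\,(1 - p)}{q\,p + \varepsilon\,(1 - p)} .
\]
```

Condition 1 keeps p finite and non-zero; condition 2 makes ε infinitesimal. If q is also finite and non-zero, the subtracted term is infinitesimal, so the probability of the simple explanation conditional on the data is infinitely close to one. If p were itself infinitesimal, the subtracted term need not be small, which is why both conditions are required.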


In any world that is only finitely complex, there will be a quite large set of 
phenomena that are caused by a small number of factors. However, what is crucial 
is that we should encounter a significant number of such phenomena at the 
normal scale of human/environment interaction. Since human beings are them- 
selves composed of a large number of complex parts, relatively little of what we 
encounter is actually simple in the absolute sense. There is a qualified sense of 
simplicity, however, that does characterize many commonly encountered phe- 
nomena. There are many systems that, although they are composed of a very 
large number of parts, can be analyzed as composed of a few intermediate-scale 
aggregates, each of which can be treated as causal units (at least, with approx- 
imate success and over a fairly wide range of conditions). Thus, the possibility 
of theoretical representations (and, therefore, the possibility of scientific 
knowledge) depends on the contingent fact that the world we inhabit is uniformly 
populated by systems characterized by such intermediate-scale simplicity. This 
contingent uniformity must itself have an ultimate causal explanation, and it is 
this causal foundation that undergirds and informs human cognition. 

Condition (2) consists in the requirement that the probability of pseudo- 
simplicity is infinitesimal. A phenomenon instantiates pseudo-simplicity when 
it is caused by a complex set of factors, there is no intermediate scale at which 
these factors can, even with approximate success, be aggregated into a small 
number of units, and yet the phenomenon is amenable to a simple putative 
explanation. If the phenomenon consists of an infinite collection of data, then 
the probability of pseudo-simplicity drops to an infinitesimal level, unless the many 
causal factors involved are themselves somehow coordinated so as to mimic the 
simpler mechanism. The absence of such a mimicry mechanism (e.g., Descartes’s 
evil genius) is an important precondition, not only of our theoretical knowledge, 
but also of theoretical cognition of any kind. 


17.5 The Incompatibility of Materialism 
and Scientific Realism 


Whenever philosophers bother to offer a defense for philosophical materialism, 
they typically appeal to the authority of natural science. Science is supposed 
to provide us with a picture of the world so much more reliable and well sup- 
ported than that provided by any non-scientific source of information that we 
are entitled, perhaps even obliged, to withhold belief in anything that is not an 
intrinsic part of our best scientific picture of the world. This scientism is 
taken to support materialism, since, at present, our best scientific picture of the 
world is an essentially materialistic one, with no reference to causal agencies 
other than those that can be located within space and time. 

This defense of materialism or “naturalism” presupposes a version of scientific 
realism: unless science provides us with objective truth about reality, it 
has no authority to dictate to us the form that our philosophical ontology and 
metaphysics must take. Science construed as a mere instrument for manipulat- 
ing experience, or merely as an autonomous construction of our society, without 
reference to our reality, tells us nothing about what kinds of things really exist 
and act. 

In this section, I will argue, somewhat paradoxically, that scientific realism 
can provide no support to philosophical materialism. In fact, the situation is 
precisely the reverse: materialism and scientific realism are incompatible. 

Specifically, I will argue that (in the presence of certain well-established facts 
about scientific practice) the following three theses are mutually inconsistent: 


1. Scientific realism 

2. Materialism (ontological naturalism, the thesis that the world of space and 
time is causally closed) 

3. Representational naturalism (the thesis that there exists a correct naturalistic 
account of knowledge and intentionality) 


By scientific realism, I intend a thesis that includes both a semantic and an 
epistemological component. Roughly speaking, scientific realism is the conjunc- 
tion of the following two claims: 


1. Our scientific theories and models are theories and models of the real world, 
including its laws, as they exist objectively, independent of our preferences and 
practices. 

2. Scientific methods tend, in the long run, to increase our stock of real knowl- 
edge. 


Ontological naturalism is the thesis that nothing can have any influence on 
events and conditions in space and time except other events and conditions in 
space and time. According to the ontological naturalist, there are no causal 
influences from things “outside” space: either there are no such things, or they 
have nothing to do with us and our world. 

Representational naturalism is the proposition that human knowledge and 
intentionality are parts of nature, to be explained entirely in terms of scientif- 
ically understandable causal connections between brain states and the world. 
Intentionality is that feature of our thoughts and words that makes them about 
things, that gives them the capability of being true or false of the world. 

I take philosophical naturalism to be the conjunction of ontological and 
representational naturalism. The two theses are logically independent: it is 
possible to be an ontological naturalist without being a representational natu- 
ralist, and vice versa. For example, eliminativists like the Churchlands, Stich, 
and (possibly) Dennett are ontological naturalists who avoid being representa- 
tional naturalists by failing to accept the reality of knowledge and intentionality. 
Conversely, a Platonist might accept that knowledge and intentionality are to 
be understood entirely in terms of causal relations, including, perhaps, causal 
connections to the Forms, without being an ontological naturalist. I will argue 
that it is only the conjunction of the two naturalistic theses that is incompatible 
with scientific realism. 

Many philosophers believe that scientific realism gives us good reason to be- 
lieve both ontological naturalism and representational naturalism. I will argue, 
paradoxically, that scientific realism entails that either ontological naturalism 
or representational naturalism (or both) is false. I will argue that nature is comprehensible 
scientifically only if nature is not a causally closed system, that is, only if nature is 
shaped by supernatural forces (forces beyond the scope of physical space and 
time). 

My argument requires two critical assumptions: 


PS: A preference for simplicity (elegance, symmetries, invariances) is a pervasive 
feature of scientific practice. 


ER: Reliability is an essential component of knowledge and intentionality, on 
any naturalistic account of these. 


17.5.1 The Pervasiveness of Simplicity 


Philosophers and historians of science have long recognized that quasi-aesthetic 
considerations, such as simplicity, symmetry, and elegance, have played a per- 
vasive and indispensable role in theory choice. For instance, Copernicus’s he- 
liocentric model replaced the Ptolemaic system long before it had achieved a 
better fit with the data because of its far greater simplicity. Similarly, New- 
ton’s and Einstein’s theories of gravitation won early acceptance due to their 
extraordinary degree of symmetry and elegance. 

In his recent book Dreams of a Final Theory, the physicist Steven Wein- 
berg included a chapter entitled “Beautiful Theories,” in which he detailed the 
indispensable role of simplicity in the recent history of physics. According to 
Weinberg, physicists use aesthetic qualities both as a way of suggesting theories 
and, even more importantly, as a sine qua non of viable theories. Weinberg 
argues that this developing sense of the aesthetics of nature has proved to be a 
reliable indicator of theoretical truth. 


The physicist’s sense of beauty is ... supposed to serve a purpose – 
it is supposed to help the physicist select ideas that help us explain 
nature. (Weinberg, 1993, p. 133) 


...we demand a simplicity and rigidity in our principles before we 
are willing to take them seriously. (Weinberg, 1993, pp. 148-149) 


For example, Weinberg points out that general relativity is attractive not 
just for its symmetry, but for the fact that the symmetry between different 
frames of reference requires the existence of gravitation. The symmetry built 
into Einstein’s theory is so powerful and exacting that concrete physical conse- 
quences, such as the inverse square law of gravity, follow inexorably. Similarly, 
Weinberg explains that the electroweak theory is grounded in an internal symmetry 
between the roles of electrons and neutrinos. 

The simplicity that physicists discover in nature plays a critical heuristic 
role in the discovery of new laws. As Weinberg explains, 


Weirdly, although the beauty of physical theories is embodied in 
rigid, mathematical structures based on simple underlying princi- 
ples, the structures that have this sort of beauty tend to survive 
even when the underlying principles are found to be wrong.... We 
are led to beautiful structures by physical principles, but the beauty 
sometimes survives when the principles themselves do not. (Wein- 
berg, 1993, pp. 151-152) 


Weinberg notes that the simplicity that plays this central role in theoretical 
physics is “not the mechanical sort that can be measured by counting equations 
or symbols” (Weinberg, 1993, p. 134). The recognition of this form of beauty 
requires an act of quasi-aesthetic judgment. As Weinberg observes, 


There is no logical formula that establishes a sharp dividing line 
between a beautiful explanatory theory and a mere list of data, but 
we know the difference when we see it. 


In claiming that an aesthetic form of simplicity plays a pervasive and indis- 
pensable role in scientific theory choice, I am not claiming that the aesthetic 
sense involved is innate or a priori. I am inclined to agree with Weinberg in 
thinking that “the universe acts as a random, inefficient, and in the long-run 
effective teaching machine” (Weinberg, 1993, p. 158). We have become attuned 
to the aesthetic deep structure of the universe by a long process of trial and 
error, a kind of natural selection of aesthetic judgments. As Weinberg puts it, 


Through countless false starts, we have gotten it beaten into us that 
nature is a certain way, and we have grown to look at that way that 
nature is as beautiful ... Evidently we have been changed by the 
universe acting as a teaching machine and imposing on us a sense of 
beauty with which our species was not born. Even mathematicians 
live in the real universe, and respond to its lessons. (Weinberg, 1993, 
pp. 158-159) 


Nonetheless, even though we have no reason to think that the origin of our 
aesthetic attunement to the structure of the universe is mysteriously prior to 
experience, there remains the fact that experience has attuned us to something, 
and this something runs throughout the most fundamental laws of nature. Be- 
hind the blurrin’ and buzzin’ confusion of data, we have discovered a consistent 
aesthetic behind the various fundamental laws. As Weinberg concludes, 


It is when we study truly fundamental problems that we expect to 
find beautiful answers. We believe that, if we ask why the world is 
the way it is and then ask why that answer is the way it is, at the 
end of this chain of explanations we shall find a few simple principles 
of compelling beauty. We think this in part because our historical 
experience teaches us that as we look beneath the surface of things, 
we find more and more beauty. Plato and the neo-Platonists taught 
that the beauty we see in nature is a reflection of the beauty of the 
ultimate, the nous. For us, too, the beauty of present theories is an 
anticipation, a premonition, of the beauty of the final theory. And, 
in any case, we would not accept any theory as final unless it were 
beautiful. (Weinberg, 1993, p. 165) 


This capacity for “premonition” of the final theory is possible only because 
the fundamental principles of physics share a common bias toward a specific, 
learnable form of simplicity. 


17.5.2 The Centrality of Reliability to 
Representational Naturalism 


The representational naturalist holds that knowledge and intentionality are en- 
tirely natural phenomena, explicable in terms of causal relations between brain 
states and the represented conditions. In the case of knowledge, representational 
naturalism must make use of some form of reliability. The distinction between 
true belief and knowledge turns on epistemic norms of some kind. Unlike some 
Platonists, representational naturalists cannot locate the basis of such norms 
in any transcendent realm. Consequently, the sort of rightness that qualifies a 
belief as knowledge must consist in some relation between the actual processes 
by which the belief is formed and the state of the represented conditions. Since 
knowledge is a form of success, this relation must involve a form of reliability, 
an objective tendency for beliefs formed in similar ways to represent the world 
accurately. 

Thus, if representational naturalism is combined with epistemic realism 
about scientific theories, the conjunction of the two theses entails that our pro- 
cesses of scientific research and theory choice must reliably converge upon the 
truth. 

A naturalistic account of intentionality must also employ some notion of re- 
liability. The association between belief-states and their truth-conditions must, 
for the representational naturalist, be a matter of some sort of natural, causal 
relation between the two. This association must consist in some sort of reg- 
ular correlation between the belief-state and its truth-condition under certain 
conditions (the ‘normal’ circumstances for the belief-state). 

This reliability may be only a conditional reliability: reliability under teleo- 
logical normal circumstances. This condition provides the basis for a distinction 
between knowledge and true belief: an act of knowledge that p is formed by pro- 
cesses that reliably track the fact that p in the actual circumstances, whereas 
a belief that p is formed by processes that would reliably track p in normal 
circumstances. 

It is possible for our reliability to be lost. Conditions can change in such 
a way that teleologically normal circumstances are no longer possible. In such 
cases, our beliefs about certain subjects may become totally unreliable. As 
Papineau observed, 


It is the past predominance of true belief over false that is required 
... [This] leaves it open that the statistical norm from now on might 
be falsity rather than truth. One obvious way in which this might 
come about is through a change in the environment. (Papineau, 
1993, p. 558) 


In addition, there may be specifiable conditions that occur with some regu- 
larity in which our belief-forming processes are unreliable. 


This link is easily disrupted. Most obviously, there is the point that 
our natural inclinations to form beliefs will have been fostered by a 
limited range of environments, with the result that, if we move to 
new environments, those inclinations may tend systematically to give 
us false beliefs. To take a simple example, humans are notoriously 
inefficient at judging sizes underwater. (Papineau, 1993, p. 100) 


Finally, the reliability involved may not involve a high degree of probability. 
The correlation of belief-type and represented condition does not have to be 
close to 1. As Millikan has observed, “it is conceivable that the devices that fix 
human beliefs fix true ones not on average, but just often enough” (Millikan, 
1989a, p. 289). For example, skittish animals may form the belief that a 
predator is near on the basis of very slight evidence. This belief will be true 
only rarely, but it must have a better-than-chance probability of truth under 
normal circumstances, if it is to have a representational function at all. 

Thus, despite these qualifications, it remains the case that a circumscribed 
form of reliable association is essential to the naturalistic account of intentional- 
ity. The reliability is conditional, holding only under normal circumstances, and 
it may be minimal, involving a barely greater-than-chance correlation. Nonethe- 
less, the representational naturalist is committed to the existence of a real, 
objective association of the belief-state with its corresponding condition. 


17.5.3 Proof of the Incompatibility 


I claim that the triad of scientific realism (SR), representational naturalism 
(RN), and ontological naturalism (ON) is inconsistent, given the theses of the 
pervasiveness of the simplicity criterion in our scientific practices (PS) and the 
essentiality of reliability as a component of naturalistic accounts of knowledge 
and intentionality (ER). The argument for the inconsistency proceeds as follows. 


1. SR, RN, and ER entail that scientific methods are reliable sources of truth 
about the world. 


As I have argued, a representational naturalist must attribute some form 
of reliability to our knowledge- and belief-forming practices. A scientific realist 
holds that scientific theories have objective truth-conditions, and that our scientific 
practices generate knowledge. Hence, the combination of scientific realism 
and representational naturalism entails the reliability of our scientific practices. 


2. From PS, it follows that simplicity is a reliable indicator of the truth about 
natural laws. 


Since the criterion of simplicity as a sine qua non of viable theories is a 
pervasive feature of our scientific practices, thesis 1 entails that simplicity is a 
reliable indicator of the truth (at the very least, a better-than-chance indicator 
of the truth in normal circumstances). 


3. Mere correlation between simplicity and the laws of nature is not good 
enough: reliability requires that there be some causal mechanism connecting 
simplicity and the actual laws of nature. 


Reliability means that the association between simplicity and truth cannot 
be coincidental. A regular, objective association must be grounded in some 
form of causal connection. Something must be causally responsible for the bias 
toward simplicity exhibited by the theoretically illuminated structure of nature. 


4. Since the laws of nature pervade space and time, any such causal mechanism 
must exist outside spacetime. 


By definition, the laws and fundamental structure of nature pervade nature. 
Anything that causes these laws to be simple, anything that imposes a consistent 
aesthetic upon them, must be supernatural. 


5. Consequently, ON is false. 


The existence of a supernatural cause of the simplicity of the laws of na- 
ture is obviously inconsistent with ontological naturalism. Hence, one cannot 
consistently embrace naturalism and scientific realism. 
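
The logical form of the argument can be checked mechanically. The following is a sketch of its propositional skeleton only: the intermediate proposition names (`REL_SCI`, `REL_SIMP`, `MECH`, `SUPER`) are labels introduced here for steps 1 through 5, and the check establishes validity (the conclusion follows from the premises), not the truth of any premise.

```python
from itertools import product

# Hypothetical labels for the propositions in steps 1-5.
NAMES = ["SR", "RN", "ON", "PS", "ER", "REL_SCI", "REL_SIMP", "MECH", "SUPER"]


def implies(a, b):
    return (not a) or b


def premises_hold(v):
    return all([
        v["PS"], v["ER"],                                        # the two critical assumptions
        implies(v["SR"] and v["RN"] and v["ER"], v["REL_SCI"]),  # step 1: science is reliable
        implies(v["PS"] and v["REL_SCI"], v["REL_SIMP"]),        # step 2: simplicity is reliable
        implies(v["REL_SIMP"], v["MECH"]),                       # step 3: a causal mechanism exists
        implies(v["MECH"], v["SUPER"]),                          # step 4: it lies outside spacetime
        implies(v["SUPER"], not v["ON"]),                        # step 5: ON is false
    ])


# Enumerate every truth assignment and keep those satisfying all premises.
assignments = [dict(zip(NAMES, vals)) for vals in product([False, True], repeat=len(NAMES))]
consistent = [v for v in assignments if premises_hold(v)]

# No surviving assignment makes all three theses true together.
triad_is_consistent = any(v["SR"] and v["RN"] and v["ON"] for v in consistent)

# Any two of the three theses can still be held jointly.
pairs_are_consistent = all(
    any(v[a] and v[b] for v in consistent)
    for a, b in [("SR", "RN"), ("SR", "ON"), ("RN", "ON")]
)
```

On this skeleton the full triad has no model while each pair does, matching the claim that scientific realism conflicts only with the conjunction of the two naturalistic theses. A critic must therefore reject one of the conditionals, most plausibly step 3 or step 4.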


17.5.4 Papineau and Millikan on Scientific Realism 


David Papineau and Ruth Garrett Millikan are two thoroughgoing naturalists 
who have explicitly embraced scientific realism. If the preceding argument is 
correct, this inconsistency should show itself somehow in their analyses of sci- 
ence. This expectation is indeed fulfilled. For example, Papineau recognizes the 
importance of simplicity in guiding the choice of fundamental scientific theories. 
He also recognizes that his account of intentionality entails that a scientific real- 
ist must affirm the reliability of simplicity as a sign of the truth. Nonetheless, he 
fails to see the incompatibility of this conclusion with his ontological naturalism. 
Here is the relevant passage: 


It is plausible that at this level the inductive strategy used by physi- 
cists is to ignore any theories that lack a certain kind of physical 
simplicity. If this is right, then this inductive strategy, when ap- 
plied to the question of the general constitution of the universe, will 
inevitably lead to the conclusion that the universe is composed of 
constituents which display the relevant kind of physical simplicity. 
And then, once we have reached this conclusion, we can use it to ex- 
plain why this inductive strategy is reliable. For if the constituents 
of the world are indeed characterized by the relevant kind of physi- 
cal simplicity, then a methodology which uses observations to decide 
between alternatives with this kind of simplicity will for that reason 
be a reliable route to the truth. (Papineau, 1993, p. 166) 


In other words, so long as we are convinced that the laws of nature just 
happen to be simple in the appropriate way, we are entitled to conclude that 
our simplicity-preferring methods were reliable guides to the truth. However, 
it seems clear that such a retrospective analysis would instead reveal that we 
succeeded by sheer dumb luck. 

By way of analogy, suppose that I falsely believed that a certain coin was 
two-headed. I therefore guess that all of the first six flips of the coin will turn 
out to be heads. In fact, the coin is a fair one, and, by coincidence, five of the 
first six flips did land heads. Would we say in this case that my assumption 
was a reliable guide to the truth about these coin flips? Should we say that its 
reliability was 5/6? To the contrary, we should say that my assumption led to 
very unreliable predictions, and the degree of success that I achieved was due 
to good luck, and nothing more. 
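
The arithmetic behind the analogy can be worked out as a short sketch. The function name `prob_heads` is an assumption introduced for the example; the calculation is the standard binomial probability for a fair coin.

```python
from fractions import Fraction
from math import comb


def prob_heads(k, n=6, p=Fraction(1, 2)):
    """Probability of exactly k heads in n flips of a coin with heads-probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


# The success rate actually observed in the story.
observed_hit_rate = Fraction(5, 6)

# The objective chance that any single 'heads' guess is right for a fair coin.
chance_per_flip = Fraction(1, 2)

# How likely the fair coin was to produce a run this favorable to the guesser.
prob_exactly_five = prob_heads(5)                     # 3/32
prob_at_least_five = prob_heads(5) + prob_heads(6)    # 7/64
```

The observed rate of 5/6 is a fact about one lucky sample. The reliability of the guessing policy is fixed by the chance set-up, where each guess has probability 1/2 of being right and a run of five or more hits has probability 7/64.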

Analogously, if it is a mere coincidence that the laws of nature share a 
certain form of aesthetic beauty, then our reliance upon aesthetic criteria in 
theory choice is not in any sense reliable, not even minimally reliable, not even 
reliable in ideal circumstances. When we use the fact that we have discovered a 
form of “physical simplicity” in law A as a reason for preferring theories of law 
B, which have the same kind of simplicity, then our method is reliable only if 
there is some causal explanation of the repetition of this form of simplicity in 
nature. And this repetition necessitates a supernatural cause. 

Papineau recognizes that we do rely on such an assumption of the repetition 
of simplicity. 


The account depends on the existence of certain general features 
which characterize the true answers to questions of fundamental 
physical theory. Far from being knowable a priori, these features 
may well be counterintuitive to the scientifically untrained. (Pap- 
ineau, 1993, p. 166) 


Through scientific experience, we are “trained” to recognize the simplicity 
shared by the fundamental laws, and we use this knowledge to anticipate the 
form of unknown laws. This projection of experience from one law to the next 
is reliable only if there is some common cause of the observed simplicity. 

Similarly, Millikan believes that nature has trained into us (by trial-and-error 
learning) certain “principles of generalization and discrimination” (Millikan, 
1989a, p. 292) that provided us with a solution to the problem of theoretical 
knowledge that was “elegant, supremely general, and powerful, indeed, I believe 
it was a solution that cut to the very bone of the ontological structure of the 
world” (Millikan, 1989a, p. 294). However, Millikan seems unaware of just how 
deep this incision must go. A powerful and supremely general solution to the 
problem of theory choice must reach a ground of the common form of the laws 
of nature, and this ground must lie outside the bounds of nature. 

Papineau and Millikan might try to salvage the reliability of a simplicity 
bias on the grounds that the laws of nature are, although uncaused, brute facts, 
necessarily what they are. If they share, coincidentally, a form of simplicity and 
do so non-contingently, then a scientific method biased toward the appropriate 
form of simplicity will be, under the circumstances, a reliable guide to the truth. 

There are two compelling responses to this line of defense. First, there is 
no reason to suppose that the laws of nature are necessary. Cosmologists often 
explore the consequences of models of the universe in which counterfactual 
laws hold. 

Second, an unexplained coincidence, even if that coincidence is a brute-fact 
necessity, cannot ground the reliability of a method of inquiry. A method is 
reliable only when there is a causal mechanism that explains its reliability. By 
way of illustration, suppose that we grant the necessity of the past: given the 
present moment, all the actual events of the past are necessary. Next, suppose 
that a particular astrological method generates by chance the exact birthday of 
the first President of the United States. Since that date is now necessary, there 
is no possibility of the astrological method’s failing to give the correct answer. 
However, if there is no causal mechanism explaining the connection between 
the method’s working and the particular facts involved in Washington’s birth, 
then it would be Pickwickian to count the astrological method as reliable in 
investigating this particular event. 

Analogously, if the various laws of nature just happen, as a matter of brute, 
inexplicable fact, to share a form of simplicity, then, even if this sharing is a 
matter of necessity, using simplicity as a guide in theory choice should not count 
as reliable. 

In my chapter, “The Incompatibility of Naturalism and Scientific Realism,” 
in the forthcoming anthology Naturalism: A Critical Appraisal, edited by 
William Lane Craig and J. P. Moreland (Craig and Moreland, 2000), I give a fuller 
version of this argument. I also deal in that chapter with alternative accounts 
of the role of simplicity, such as those of Forster and Sober (1994), Reichenbach 
(1956), and Turney (1990). I show that, in each case, the rationales given for 
the use of simplicity as a criterion are inadequate to salvage a genuine scientific 
realism. 


17.5.5 The Ramsey-Lewis Account of Laws 


Frank Ramsey (1990) and David Lewis (1994) have proposed 
an account of the nature of natural law that would dispose of any need to 
explain the reliability of simplicity as an indicator of genuine lawhood. Their 
account simply identifies the laws of nature with the axioms of the best theory 
of the world, where best is cashed out in terms of such virtues as simplicity, 
strength, and fit with the empirical data. Hence, it becomes an analytic truth 
that simplicity is a criterion of the lawfulness of a confirmed generalization. 

However, the Ramsey-Lewis account fails to satisfy my definition of scien- 
tific realism, since on their account, what the actual laws of nature are is not a 
fully objective matter. Which generalizations are laws depends in part on our 
preferences and practices, in particular, on our preferences for certain kinds of 
simplicity. Lewis suggests (Lewis, 1994, p. 479) that if nature is “kind”, the 
subjectivity of his account can be somewhat mitigated: it may be that there is 
a single system of laws that is robustly best, best under a variety of conceptions 
of simplicity. This is only a somewhat mitigated form of subjectivism, however, 
since if the simplicity criterion is to have any real bite, our preference for sim- 
plicity must play an ineliminable role in determining which generalizations are 
in fact laws of nature. 

In addition, Lewis cannot take seriously Weinberg’s suggestion that we are 
learning the correct aesthetic in response to our more and more extensive inter- 
actions with nature. Weinberg’s view takes for granted that it is a very specific 
and constrained conception of simplicity that guides science at each point, that 
this conception of simplicity changes substantially over time, and that as we 
learn more about the aesthetic properties the true laws of nature share, we 
become better at identifying new laws of nature. The Ramsey-Lewis account 
assumes that the relevant conception of simplicity is generic and fixed, and it 
provides no way of making sense of a learning process by which our aesthetic 
sense becomes better attuned to that of the universe. 

In case any doubt remains about the lack of objectivity in our knowledge 
of natural law on the Ramsey-Lewis account, consider this fact: there is no 
possibility on the Ramsey-Lewis account of a causal connection between the 
facts about natural law and our opinions about those facts. For Ramsey and 
Lewis, Humean supervenience is a given: the modal and stochastic facts are 
wholly determined by the distribution of occurrent properties in the actual 
world. Which statements are in fact laws of nature depends on the whole course 
of the actual world, past, present, and future. A genuine law of nature must fit 
the occurrent facts of the future as well as the past. Thus, the fact that some L 
is a law of nature supervenes on occurrent facts spread throughout time. This 
latter fact cannot be a cause of our current opinions, since much of it lies in the 
future, causally posterior to our opinions. 

The need for a causal connection between the laws of nature and our scien- 
tific beliefs (the kind of connection that the Ramsey-Lewis account precludes) 
can be seen by considering, once again, Gettier-like examples of failed knowl- 
edge. Consider the following counterfactual world. Newton bases his theory 
of the inverse square law of gravitation almost entirely on observations of the 
movements of the planets, which exactly match the observed movements of the 
planets in the actual world. However, in this hypothetical world, the planets 
move the way they do because they are firmly attached to a system of elliptical 
rail lines in space, constructed millions of years ago by visitors from the 
Andromeda galaxy. These Andromedans built the rail lines in conformity to 
certain complex religious beliefs they held, and the fact that the lines force the 
planets to move in exactly the orbits they do has in fact nothing whatsoever 
to do with the force of gravity. In such a case, Newton’s beliefs about gravity 
would be as true and as justified as they are in the actual world, but they clearly 
would not constitute knowledge of the nature of gravity, because of the lack of 
the right sort of causal connection between Newton’s theory and the inverse 
square law itself. 

This lack of causal connection between the laws of nature and our scientific 
beliefs means that Ramsey and Lewis cannot give a teleological account of either 
the content of our beliefs about natural law or of our knowledge of natural law. 
If we identify what is rational with what is required by the “design plan” (to 
use Plantinga’s phrase) of our mind, that is, with the fulfillment of the proper 
functions of our ratiocinative faculties, then our reasoning about natural law 
falls outside the scope of reason. To treat natural laws as Ramsey and Lewis 
suggest we do must be to follow some purely positive social norm, to conform 
to a practice that is a non-adaptive product of historical accident. Our beliefs 
about natural law are, therefore, not genuinely about the world at all, but 
merely embodiments of this non-functional practice. 

In contrast, a modal realist can give a perfectly good account of our law- 
inducing practices in terms of natural selection. Since natural law (and its 
corollary, objective chance) causally affects the future course of events, it is 
plainly a matter of adaptive fitness to be well attuned to the actual laws of 
nature and the actual objective chances. These same laws and chances have 
been active in shaping the past, so it is possible for nature (despite her myopia 
about the future) to have succeeded in selecting for this attunement. Observed 
patterns and frequencies in the past are, thanks to the reality of law and chance, 
fallible but reliable indicators of future patterns and frequencies. 

Since reason is simply the fulfillment of the mind’s proper functions, it is 
paradigmatically rational for the mind to practice inference to the best theory 
(including the progressive improvement in the standards of goodness in theories 
that Weinberg describes). 


17.6 When Does Bayesian Learning Constitute Knowledge? 


Bayesian learning consists in updating one’s subjective probabilities in light of 
new evidence by conditionalizing, that is, by making the posterior probability P’(A) 
equal to the prior conditional probability P(A/E), where E represents the 
new information learned with certainty. There are a number of Bayesian convergence 
results, demonstrating that, in the infinite long run, the probability of 
empirical hypotheses (hypotheses for which there is no underdetermination of 
theory by data) will converge (with a subjective probability of 1) to a single 
value, washing out the effects of differences in the original priors. 
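
As a rough illustration of conditionalization and of the washing out of priors, the following Python sketch assumes two learners whose priors over the chance of heads are Beta distributions. The Beta family, the learner names, the prior parameters, and the true chance of 0.7 are all assumptions made for the example, not part of the convergence results themselves.

```python
import random

def conditionalize(alpha, beta, heads):
    """Conditionalize a Beta(alpha, beta) prior on one observed toss.

    For a Beta prior, setting P'(A) = P(A/E) amounts to incrementing
    the parameter that matches the observed outcome E.
    """
    return (alpha + 1, beta) if heads else (alpha, beta + 1)

def probability_of_heads(alpha, beta):
    """Subjective probability of heads on the next toss."""
    return alpha / (alpha + beta)

def run(true_chance=0.7, n_trials=5000, seed=0):
    """Feed the same evidence to two learners with opposed priors."""
    rng = random.Random(seed)
    # Hypothetical priors: one learner expects heads, the other tails.
    learners = {"optimist": (9.0, 1.0), "pessimist": (1.0, 9.0)}
    for _ in range(n_trials):
        heads = rng.random() < true_chance
        learners = {name: conditionalize(a, b, heads)
                    for name, (a, b) in learners.items()}
    return {name: probability_of_heads(a, b)
            for name, (a, b) in learners.items()}

if __name__ == "__main__":
    print(run())
```

After several thousand shared observations the two posterior probabilities differ only slightly, although the priors began far apart. The sketch shows agreement only; it says nothing about whether the agreed value amounts to knowledge.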

However, we are interested in more than just convergence to agreement: we 
are also interested in convergence to knowledge. When does Bayesian learning 
converge in the long run to a state of knowledge? Let us suppose that the 
Bayesian is estimating the probability of a given outcome (such as “heads”) in 
each of a series of ‘exchangeable’ trials (such as flips of the same coin in the same 
conditions).³ In the long run, all Bayesian learners will converge to the same 
value as the true, ‘objective’ probability of this outcome in these trials. From 
the viewpoint of teleological reliabilism, a convergence to a range r of objective 
probabilities constitutes knowledge when the following six conditions are met: 


1. The trials in the series are objectively exchangeable, in the sense that 
they all have approximately the same objective probability, and this objective 
probability of heads is in fact in the range r. 


2. The subjective exchangeability of the trials (the symmetry of the probabil- 
ities of permutations of outcomes in the subjective prior) carries robustly 
the information that these trials are objectively exchangeable. 


3. The subjective exchangeability has the proper function of carrying this 
information robustly. 


4. The actual outcomes of the trials were causally irrelevant to the deter- 
mination of which of the trials were observed (i.e., there was no causally 
grounded bias in the selection of the observed cases). 


5. The number of observed trials was great enough to make the objective 
probability of convergence to r very high (this is the condition to which 
various convergence results, including the law of large numbers, are rele- 
vant). 


6. The fact that the hypothesis that the objective chance of “heads” lies in r had a 
finite (non-zero) prior probability was itself an instance of partial knowledge. 


The last condition introduces the notion of partial knowledge. Partial knowledge 
of p consists in a state in which p is true, and p belongs to some set $\pi$ of 
mutually exclusive propositions, where each member of $\pi$ is given a finite, non-zero 
probability, and where this assignment of prior probabilities to the members 
of $\pi$ has the proper function of robustly carrying the information that the disjunction 
of the members of $\pi$ is true. In other words, p itself is a cause of the 
assignment of positive probability to p, and this causal chain accords with the 
proper functioning of the believer’s subjective-probability state. The believer 
must know (in the teleological-reliabilist sense) that the disjunction of $\pi$ is true. 


³ In de Finetti’s convergence result (de Finetti (1980)), a series of trials is exchangeable 
just in case the prior probability of any two series of outcomes is equal whenever the series 
are permutations of each other, that is, whenever the number of the various outcomes is the 
same in each series. Exchangeability represents a kind of symmetry in the assignment of prior 
probabilities. 




This sixfold condition is needed to exclude Gettier-like examples of Bayesian 
convergence to the truth that fails to constitute knowledge. Consider the fol- 
lowing Gettier cases: 


1. The Bayesian converges to the correct objective probability, but does so 
because some all-powerful genie made visible a series of outcomes selected 
because of their conforming to the genie’s favorite pattern, which just 
happened to coincide statistically with the objective chance. 


2. The Bayesian’s normal prior probability function would have assigned zero 
to the probability of the hypothesis that the objective chance was r, but, 
fortunately, a blow to the head caused the Bayesian to assign a non-zero 
probability to this hypothesis. 


3. The trials in the series were both subjectively and objectively exchange- 
able, but they were subjectively exchangeable only because the Bayesian 
learner wrongly thought that each of the trials occurred on a weekday. 
Had the Bayesian learner discovered that many of the trials actually oc- 
curred on weekends, the trials would no longer have been subjectively 
exchangeable, and no convergence to the truth would have occurred. 


In each of these cases, the Bayesian would have converged to the truth as 
a result of perfectly correct applications of conditionalization, but the resulting 
rational true belief with certainty would not have constituted knowledge of the 
objective chance. 
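
The first Gettier case, which violates condition 4, can be sketched in Python as follows. The function name, the selection rule, and the numerical values are hypothetical: the "genie" is modeled simply as a filter that reveals a toss only when doing so keeps the running frequency of revealed heads near the genie's preferred value.

```python
import random

def genie_selected_frequency(true_chance, target, n_shown=1000, seed=0):
    """Frequency of heads among the tosses the genie chooses to reveal.

    The tosses themselves have objective chance `true_chance` of heads
    (assumed strictly between 0 and 1), but which tosses are revealed
    depends causally on their outcomes.
    """
    rng = random.Random(seed)
    shown = 0
    shown_heads = 0
    while shown < n_shown:
        heads = rng.random() < true_chance
        # Reveal a head only when the running frequency is at or below
        # the target, and a tail only when it is above the target.
        want_heads = shown_heads <= target * shown
        if heads == want_heads:
            shown += 1
            shown_heads += int(heads)
    return shown_heads / n_shown

if __name__ == "__main__":
    for chance in (0.3, 0.5, 0.8):
        print(chance, genie_selected_frequency(chance, target=0.3))
```

Under these assumptions the revealed frequency stays close to the genie's target whatever the true chance is. A learner who conditionalizes on the revealed tosses therefore lands on the objective chance only when the target happens to coincide with it, which is the situation described in the first case.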


17.7 Objective Chance and Empiricism 


The notions of metaphysical necessity and objective chance play fundamental 
roles in my account of causation and, consequently, in my accounts of knowledge 
and the mind. A number of epistemological challenges to modality and objective 
chance have been lodged in recent years by empiricists such as John Earman, Bas 
van Fraassen, and David Lewis. These include the non-supervenience of chance 
on occurrent fact and the problem of finding a rational basis for a connection 
between subjective and objective probability. 

Earman (1984) argues that objective chance cannot be acceptable to an em- 
piricist unless it supervenes on occurrent facts. He calls this the “acid test” of 
empiricism. I am dubious about the viability of a distinction between ‘occur- 
rent’ and ‘dispositional’ or ‘modal’ facts or properties. It may be that all the 
properties with which we are familiar are at least partly dispositional or modal 
in character. However, for the sake of argument, I am willing to concede that 
we can make some sort of sense of an occurrent/dispositional distinction. On 
any reasonable view of objective chance, objective chance does not supervene 
on such occurrent facts. 

Earman’s insistence on supervenience assumes that the only properties that 
can be observed are occurrent properties. This is, of course, the essence of 
Humean philosophy. It may be, as Hume thought, that no dispositional property 
is perceived qua dispositional property (in its guise as the particular 
dispositional property it is) in single, isolated cases of sensory perception. 
Even if we grant this, it does not follow that dispositional facts and properties 
(including facts about objective chance) are not perceived at all. There remain 
at least two possibilities: (1) we perceive the dispositional property in a single 
case, but we do not perceive its internal structure (we do not perceive it as 
dispositional, and we do not perceive which disposition it is), or (2) our per- 
ception of the dispositional property qua disposition emerges in the context of 
perceiving a series of relevant cases. It is the second possibility that I want to 
pursue here. 

Humeans may object that I am not entitled to use the word ‘perception’ 
in describing knowledge of objective chance that arises from a long series of 
separate observations. I would ask the Humean to consider the following illus- 
tration. I cannot perceive the Louvre in a single experience — my perception of 
the Louvre only emerges through a long series of separate experiences, experi- 
ences of various aspects of its exterior and of the contents of its various salons. 
Nonetheless, it would seem odd to insist that the Louvre is imperceptible. Sim- 
ilarly, the objective chance of a tossed coin’s landing heads cannot be perceived 
in a single observation, but our perception of it emerges from a series of separate 
observations of coin tosses. 

What makes the objective chance perceptible is the existence of a causal 
chain of the right kind between the objective chance and corresponding mental 
states. This is also what makes various occurrent properties perceptible. Ear- 
man may be assuming that only occurrent properties can enter into such causal 
connections. One of the principal tasks that I undertook in part I was to expose 
the groundlessness of this assumption. 

Earman could still insist that our beliefs about objective chance are better 
described as formed by inference rather than by perception. Once again, this 
is a distinction of dubious value. From the point of view of teleological reliabilism, 
what matters is whether a belief is formed in the proper, reliable manner, 
informed by the appropriate factual situation. Whether this process is best 
described as one of ‘perception’ or ‘inference’ is a secondary matter. However, 
once again, let’s set these caveats aside for the sake of argument and suppose 
that beliefs about objective chance are based on inferences from observations. 
Why does this necessitate a principle of modal and stochastic supervenience? 

Earman seems to be assuming that the only form of inference that can 
ground inferential knowledge is deductively valid inference, inference in which 
there is an absolute guarantee that truth is preserved. This is, of course, another 
typically Humean dogma. Unless Earman assumes this, there is no reason why 
I cannot say both that dispositional and other modal beliefs are inferred from 
observations of occurrent fact, and that modal and dispositional facts do not 
supervene on the occurrent facts. Supervenience is a very strong condition: it 
means that it is impossible for the dispositional facts to vary once the occurrent 
facts are fixed. Supervenience could fail, and it could still be true that one can 
reliably infer the dispositional facts from the occurrent ones. 

As a reliabilist, I would respond by insisting that the Humean sets the standard 
for the reliability of inference far too high. I can know, on the basis of a 
large number of observations of occurrent fact, that the objective probability 
of some outcome lies in range r, even though it is possible for the very same 
occurrent facts to be actual in a world in which the objective probability falls 
outside of r. The mere metaphysical possibility of error is not sufficient grounds 
for denying a claim to knowledge, unless we insist on repeating Descartes’s 
fundamental error. 

In characterizing the reliability of statistical inference, it is important to 
bear in mind two facts. First, all information is relational, in the sense that an 
information connection always takes this form: a fact $(s, \varphi)$ carries the information 
that some token of type $\psi$ exists in relation R to s. Second, information is 
often abstract, involving quantification over types. Thus, I do not acquire the 
information from a series of observations that the objective probability of some 
type $\psi$ lies in a range r: instead, I acquire the information that there is some 
type $\chi$ in relation R to the token-observations such that the objective probability 
of $\psi$ conditional on $\chi$ is in range r. This means that the reliability of an 
information channel can be evaluated without bringing in higher-order objective 
probabilities. That is, I do not want to say that there is some objective chance 
that the objective chance of $\psi$ lies in r, conditional on the observation series s, 
since this presupposes that it makes sense to talk about the objective chance 
of the objective chance of $\psi$. Instead, I want to say that there is a conditional 
objective chance that, given series s of observations realizing type $\varphi$, there exists 
a type $\chi$ that is realized in relation R to each member of the series s such that 
the real objective chance of $\psi$ conditional on $\chi$ lies in the range r. The reliability 
of a statistical method is measured, not by hypothetically varying the world’s 
objective chance function, but by varying the values of the parameters of the 
trials which determine the objective chances of the outcome in situ. 

To evaluate the reliability of some belief-forming process, we need to discover 
whether it causes belief to track the truth. This means that we must consider 
the beliefs that would be formed across a range of hypothetical situations. In 
evaluating statistical inference, where the beliefs that are formed are judgments 
of objective probability, we must decide how to vary the objective probabilities 
being estimated. It would be problematic to make the objective chance function 
itself vary, since this would require us to make judgments of higher-order chance, 
the chance that objective chance might vary in certain ways. The alternative 
that I am suggesting involves varying some parameter shared by the trials for 
which the objective chance of the outcome is being estimated. This means 
leaving the objective chance function itself unchanged but instead changing the 
condition on which objective chance is being evaluated to a different condition 
on which the objective chance function determines a different probability-value 
for the relevant outcome. 

By way of illustration, suppose that I am trying to estimate the objective 
chance of “heads” in a series of identical tosses of an unchanging coin. Suppose 
that I observe thirty “heads” outcomes in a row. I estimate that the objective 
chance of “heads” is at or very near 1. To test the reliability of my method of 
forming such estimates, we would consider what estimates I would have formed 
had certain relevant parameters of the coin or the tosses been different. We can 
estimate the objective chance of these alternative parameter-values, by consid- 
ering the processes by which the coins or the tosses were produced (the distri- 
bution of weight in the coin, how hard the coin is tossed, etc.). We can then 
measure the reliability of my inference by measuring how likely it is that my 
estimate would have varied significantly from the objective chance that would 
have resulted from these alternative parameter-values, weighted by the objective 
chance of the alternatives. 
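
The reliability measure just described can be given a rough numerical form. In the Python sketch below, the list of alternative coin biases, their weights, the tolerance, and the function names are all hypothetical. The estimation method is assumed to be the simple one of taking the observed frequency of heads as the estimate.

```python
from math import comb

def miss_probability(bias, n_tosses, tolerance):
    """Objective chance that the observed frequency of heads differs
    from `bias` by more than `tolerance` in `n_tosses` tosses."""
    total = 0.0
    for heads in range(n_tosses + 1):
        if abs(heads / n_tosses - bias) > tolerance:
            total += (comb(n_tosses, heads)
                      * bias ** heads
                      * (1 - bias) ** (n_tosses - heads))
    return total

def reliability(alternatives, n_tosses, tolerance):
    """One minus the chance of a significant miss, averaged over the
    alternative parameter-values and weighted by their own chances.

    `alternatives` is a list of (bias, weight) pairs.
    """
    total_weight = sum(weight for _, weight in alternatives)
    miss = sum(weight * miss_probability(bias, n_tosses, tolerance)
               for bias, weight in alternatives)
    return 1.0 - miss / total_weight

# Hypothetical production process: most coins are fair, a few are
# heavily weighted, and a few are two-headed.
ALTERNATIVES = [(0.5, 0.90), (0.9, 0.05), (1.0, 0.05)]

if __name__ == "__main__":
    for n in (30, 300):
        print(n, reliability(ALTERNATIVES, n, tolerance=0.1))
```

Note that the objective chance function itself is never varied here. Only the bias, a parameter of the trials, changes from one alternative to the next, and each alternative is weighted by the chance that the production process yields it.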

Van Fraassen’s problems with objective chance (van Fraassen, 1987, pp. 38-39, 
80-86) lie in a different quarter. Van Fraassen poses a dilemma for the 
objectivist: he must solve either the identification problem or the inference 
problem. The identification problem is the problem of identifying what sort of 
facts in the world make claims about objective chance true. The demand for a 
solution of the identification problem is essentially a demand for the reduction of 
stochastic facts to occurrent facts. I take modal and stochastic facts as primitive, 
irreducible constituents of the world, so (unlike Armstrong and Tooley) I decline 
to attempt a solution to the identification problem. 

Since I decline the identification problem, I must face van Fraassen’s infer- 
ence problem. In the case of objective chance, the inference problem is the 
problem of explaining the basis for a rational constraint on the relationship be- 
tween subjective and objective probability. I go along with most objectivists in 
accepting Miller’s principle (which David Lewis calls the “Principal Principle”): 


$$\mathrm{Pr}_{subj}(\varphi \,/\, \mathrm{Pr}_{obj}(\varphi) \in r) \in r$$


According to Miller’s principle, my subjective probability for $\varphi$, conditional 
on the supposition that the objective probability of $\varphi$ lies in the interval r, must 
itself lie in the interval r. The problem that van Fraassen raises is this: what is 
the basis for this “must”? Appeals to pragmatic coherency, including appeals to 
the rational necessity of immunity to Dutch books, are of no avail in supporting 
Miller’s principle in this form, since all such appeals are concerned solely with 
the relation of subjective probabilities to other subjective probabilities. None of 
these arguments can ground a constraint on the relationship between subjective 
and objective probabilities. If a solution to van Fraassen’s inference problem 
must take the form of such an appeal, then no solution is possible. 

However, I am a thoroughgoing primitivist. I claim that Miller’s principle is 
a primitive demand of reason, for which no further justification is necessary or 
possible. I thus decline van Fraassen’s second problem as well. Why does van 
Fraassen think that the rationality of Miller’s principle needs to be grounded 
in an argument from consistency? Like Earman, and like all Humeans, van 
Fraassen thinks that the only form of inference for which no justification is 
needed is deductive inference. This means that the only form of rational coherency 
that van Fraassen can recognize is some form of deductive consistency, 
including the probabilistic generalization of deductive consistency, namely, absolute 
immunity to Dutch books. 

In contrast, I see deductive and non-deductive inference as very much on 
a par. In both cases, the inferences are rational because they are required by 
the fulfillment of the proper functions of the human mind. Since the point of 
human inference is the extension of knowledge, we can infer that any rational 
form of inference will be a reliable form of inference. In deductive inference, 
this reliability reaches its apex: the metaphysical impossibility of error. In 
many cases, the reliability of rational inference does not reach so high. 

Analogously, the price of deductive inconsistency, including probabilistic in- 
consistency, is very high: one is in a state that is necessarily less than optimal 
from the point of view of truth. Other forms of rational incoherency, such as the 
violation of Miller’s principle, come with a lower, but still very substantial, price. 
There is no proof that I will necessarily go wrong if I violate Miller’s principle, 
but the objective probability that I will go wrong is high, and it grows with 
my deviation from the principle. One who conforms to Miller’s principle is more 
likely (objectively speaking) to succeed in the long run. This, together with the 
fact that the proper function of subjective probability consists in aiming at the 
maximization of the objective expectation of utility, is enough to ground the 
rationality of Miller’s principle. 
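
The claim that the price of violating Miller’s principle grows with the size of the deviation can be illustrated with a toy betting model. Everything in the following Python sketch is an assumption made for the sake of the example: the agent is offered, at prices spread evenly over the unit interval, a ticket paying one unit on heads, and she buys exactly when the price is below her subjective probability of heads.

```python
def objective_expected_gain(credence, chance, n_prices=10_000):
    """Average objective gain per offer for an agent who buys a ticket
    paying 1 on heads whenever the asking price is below `credence`.

    `chance` is the objective probability of heads; prices are taken
    on an even midpoint grid over [0, 1].
    """
    total = 0.0
    for k in range(n_prices):
        price = (k + 0.5) / n_prices
        if price < credence:
            # Objective expectation of a purchase at this price.
            total += chance - price
    return total / n_prices

if __name__ == "__main__":
    chance = 0.6
    for credence in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(credence, objective_expected_gain(credence, chance))
```

In this model the objective expectation is approximately chance × credence − credence²/2, which is greatest when the credence equals the chance and falls off with the square of the gap between them. No violation guarantees a loss on any particular bet, but the objective expectation is lower the further the credence strays from the chance.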

Van Fraassen insists on an internal characterization of rationality — ulti- 
mately, one that can be cashed out in terms of internal consistency. It is not 
surprising that from such a perspective van Fraassen cannot make sense of any 
primitive rational constraints linking subjective and objective probability. In 
contrast, the teleological reliabilist’s approach to the characterization of ratio- 
nality is thoroughly externalist. Rationality is about getting to the truth with a 
high degree of objective rationality. Violations of Miller’s principle violate this 
external end: hence, they are irrational. 

If Earman, van Fraassen, and Lewis are right in rejecting objective chance, 
then there is no possibility of a teleological account of our inductive practices. 
We might call this the Darwinian problem of induction: how can we explain 
induction as something for which nature selects? If there is no objective chance, 
then there is no causal factor lying behind both past, observed frequencies and 
future, to-be-encountered frequencies. Nature only cares about the latter, but 
she has access only to the former. Nature selects for things that contribute to 
our fitness, which is forward looking. The fact that our subjective beliefs are 
well attuned to past frequencies has nothing to do with our present reproductive 
fitness, which has to do with the chances of our survival and reproduction in the 
future. However, it is impossible for natural selection to bring about directly our 
attunement to future frequencies, since the events constituting these frequencies 
are causally posterior to our current beliefs and inferential faculties. 

Induction contributes to our fitness only if there is a causal explanation that 
links attunement to past frequencies with attunement to future frequencies. 
This causal explanation must make reference to objective chance as the tertium 
quid. Why then is it reasonable for me to conform to Miller’s principle? The 
function of subjective degrees of belief is to be the best possible estimate of 
objective chance. The more closely our subjective degrees of belief approximate 
the objective chances, the greater is the objective chance that the act that 
maximizes our subjective expectation of success will also maximize our objective 
expectation of success. To fail to follow Miller’s principle is to guarantee a 
discrepancy between subjective belief and objective chance. To do this is to 
frustrate the mind’s proper function, a paradigmatic case of irrationality. 


18 


Enduring Substances 
and Their Identities 


18.1 Substances as Logical Constructions 


The ontologically basic components of the world are situation-tokens and situation- 
types. Situation-tokens encompass such things as events, states, processes, his- 
tories, and (in one sense of the word) facts. Situation-types are predicables: 
multiply-instantiated properties and relations of situation-tokens. 

The most prominent inhabitants of the world are enduring substances, spa- 
tially extended things, typically composed of matter, that experience change, 
that are sometimes created and destroyed, and that have histories. Enduring 
substances include such things as living organisms, artifacts, discrete, homoge- 
neous masses, and social institutions. 

Enduring substances, along with facts about their identities and varying 
properties and relations, are logical constructions from situation-tokens and 
types. In the final analysis, reality consists merely of situation-tokens. Sub- 
stances are really structures of situation-tokens considered or treated in a special 
way. 

The first step in this construction is the definition of a substance history. 
A substance history is a stable, causally connected series of situation-tokens. 
Formally, a substance history is a pair $(C, \varphi)$, where $C$ is a finite sequence 
$(c_1, \ldots, c_n)$ of situation-tokens, and $\varphi$ is a situation-type, where: 


1. each $c_i$ is of type $\varphi$, 


2. for each $i < n$, $c_i$’s being of type $\varphi$ causally explains $c_{i+1}$’s being of type 
$\varphi$, and 


3. there is no sequence $C'$ meeting conditions (1) and (2) such that $C$ is a 
proper sub-sequence of $C'$. 


A substance history is a self-perpetuating stability. The situation-type $\varphi$ is 
the sortal whose persistence unifies the history. 
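
The three conditions can be sketched as a check over a finite stock of situation-tokens. The Python below is only an illustration under stated assumptions: the `Token` class, the `explains` relation, and the example data are hypothetical, and condition (3) is simplified to a test that no single further token can be inserted anywhere in the sequence, whereas the definition quantifies over all longer sequences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    """A situation-token together with the situation-types it realizes."""
    name: str
    types: frozenset

def is_chain(tokens, sortal, explains):
    """Conditions (1) and (2): every token is of the sortal type, and each
    token's being of that type causally explains its successor's being so."""
    if not tokens:
        return False
    if any(sortal not in t.types for t in tokens):
        return False
    return all(explains(a, b, sortal) for a, b in zip(tokens, tokens[1:]))

def is_substance_history(tokens, sortal, explains, universe):
    """Conditions (1)-(3), with maximality approximated by checking that
    no token of `universe` can be inserted at any position."""
    if not is_chain(tokens, sortal, explains):
        return False
    for extra in universe:
        if extra in tokens:
            continue
        for pos in range(len(tokens) + 1):
            candidate = tokens[:pos] + [extra] + tokens[pos:]
            if is_chain(candidate, sortal, explains):
                return False
    return True

# Hypothetical data: four successive stages of one persisting thing.
SORTAL = "one_gram_mercury"
STAGES = [Token(f"t{i}", frozenset({SORTAL})) for i in range(4)]
LINKS = {("t0", "t1"), ("t1", "t2"), ("t2", "t3")}

def explains(a, b, sortal):
    """Stand-in for causal explanation: a fixed successor relation."""
    return (a.name, b.name) in LINKS

if __name__ == "__main__":
    print(is_substance_history(STAGES, SORTAL, explains, STAGES))
    print(is_substance_history(STAGES[:3], SORTAL, explains, STAGES))
```

The full sequence passes, while a truncated sequence fails the maximality test because a further stage can still be added to it.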

For example, suppose we have an isolated blob of mercury (causally isolated 
from any other mercury) with one gram of mass. There is a series of 
situation-tokens of a particular type, namely, being homogeneously mercurial and having 
one gram of mass, and for each member of the series, the fact that it is of 
this type explains, via the law of conservation of mass and various principles 
of inertia, the fact that its successor is also of this type. If the blob were not 
causally isolated from a second gram of mercury, then it would not be the case 
that there is a substance history associated with the first gram of mercury alone. 
Unless we can causally isolate the two grams of mercury (say, at the atomic 
level), we cannot use laws of conservation to provide separate explanations of 
the persistence of each of the two grams of mercury. Instead, the persistence 
of the two grams of mercury would constitute a single, indivisible substance 
history. 

For a second example, we could consider Aristotle’s bronze statue. In the 
actual world, the history of the mass of bronze overlaps with the history of the 
statue per se. The mass of bronze could have existed, as a discrete, homogeneous 
quantity of alloy, before the statue came to exist, and it could go on existing 
after the statue ceased to exist. Conversely, the statue can continue to exist 
even after some part of the mass of bronze has corroded away. Qua mass of 
bronze, the sortal unifying one history is that of being a homogeneous mass of 
bronze of a certain quantity. Qua statue, the unifying sortal is a higher-order, 
functional property: that of serving some public or aesthetic purpose. 

The histories of living organisms provide very clear examples of substance 
histories. The type that is sustained throughout the substance history of an 
organism is a conjunction of higher-order, teleofunctional types. This accounts 
for the fact that we can continue to have a single history, even though the stages 
vary widely in many first-order physical properties (from weighing only a few 
grams to weighing far too many kilograms, for instance). Living systems are 
self-perpetuating at the functional level: the fulfillment of biological functions 
at one stage causally explains their fulfillment at the succeeding stage. 

To each substance history $C_\varphi$, there is a corresponding substance $C^\varphi$. Substances 
must not be identified with substance histories simpliciter. Substance 
histories are series of situation-tokens, and substances are not. Substance histories 
are spread out across time, while substances endure through time. Substances 
are a kind of logical construction out of substance histories. The properties 
of substances must be explained in terms of corresponding properties of 
histories. My aim is to reduce substance-talk to situation-talk. If I am success- 
ful, I will have saved the appearances, in the sense that our ordinary uses of 
substance language will come out as largely true under my analysis. 

To each situation-type $\psi$, there is a corresponding substance-type $\psi^*$. These
two types are not identical: they characterize different categories of beings and
have quite different relations to locations in space and time. If $C_\phi$ is a substance
history, $s$ is a constituent token of $C_\phi$, and $\psi$ is a situation-type, then:

$$s \models (C^*_\phi \models \psi^*) \leftrightarrow s \models \psi$$


Informally, substance $C^*_\phi$ has property $\psi^*$ at situation-token $s$ just in case $s$
belongs to the corresponding substance history $C_\phi$ and $s$ is of the corresponding
situation-type $\psi$. A similar account of relations between substances can be given
in terms of relations between token-constituents of the two substance histories. 

Change with respect to $\psi^*$ between situations $s$ and $s'$ can be accounted
for as follows: $s$ and $s'$ are both constituents of substance history $C_\phi$, with $s$
causally prior to $s'$, $s$ is of type $\psi$ and $s'$ is of type $\neg\psi$. In such a case, we say
that substance $C^*_\phi$ has changed from $\psi^*$ to $\neg\psi^*$.
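
The attribution and change clauses above can be sketched in code. What follows is a minimal Python illustration under stated assumptions, not part of the formal theory: the names `Token`, `History`, `has_property`, and `changed` are hypothetical, situation-types are reduced to string labels, and causal priority is approximated by position in the history's sequence. Each token records the types it verifies and the types it falsifies, so that a type may be left undecided by a token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A situation-token (hypothetical encoding)."""
    name: str
    verifies: frozenset = frozenset()   # situation-types the token supports
    falsifies: frozenset = frozenset()  # situation-types whose negation it supports


@dataclass(frozen=True)
class History:
    """A substance history: a causally ordered sequence of tokens under a sortal."""
    sortal: str
    tokens: tuple


def has_property(history, token, situation_type):
    """The substance has the starred property at `token` just in case the token
    belongs to the history and is of the given situation-type."""
    return token in history.tokens and situation_type in token.verifies


def changed(history, earlier, later, situation_type):
    """Change from psi* to not-psi* between two constituents of one history.
    Assumption: causal priority is modeled by order within the sequence."""
    if earlier not in history.tokens or later not in history.tokens:
        return False
    in_order = history.tokens.index(earlier) < history.tokens.index(later)
    return (in_order
            and situation_type in earlier.verifies
            and situation_type in later.falsifies)


s1 = Token("s1", verifies=frozenset({"thin"}))
s2 = Token("s2", falsifies=frozenset({"thin"}))
outsider = Token("s3", verifies=frozenset({"thin"}))
kirk = History("human", (s1, s2))

print(has_property(kirk, s1, "thin"))        # attribution at a constituent token
print(has_property(kirk, outsider, "thin"))  # token outside the history
print(changed(kirk, s1, s2, "thin"))         # change between s1 and s2
```

Keeping `verifies` and `falsifies` separate is a design choice meant to respect the partiality of situations: a token that is silent about a type neither supports the attribution nor counts as a change.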


It is important to bear in mind that a substance-property such as $\psi^*$ is not
a relational property: if $s \models (C^*_\phi \models \psi^*)$, we are not to think of $\psi^*$ as a relation
between $C^*_\phi$ and $s$. Rather, the attribution of $\psi^*$ to $C^*_\phi$ is true at $s$. The whole
construction $(C^*_\phi \models \psi^*)$ is essentially a complex situation-type.


Substance $C^*_\phi$ is identical to substance $D^*_\psi$ if and only if $\phi$ and $\psi$ are necessarily
co-extensive, and there is some initial segment $C_{<i}$ of sequence $C$ and
some initial segment $D_{<j}$ of $D$ such that $C_{<i} = D_{<j}$. This stipulation captures
the Kripkean view that it is the sortal and the origin, and only these, of a substance
that are essential to its identity. If $C$ and $D$ share some initial segment,
and both are substance histories relative to some sortal $\phi$, then they represent
two possible life-stories for the substance $C^*_\phi$, one, for example, in which the
substance becomes a philosopher, and another in which it becomes a stockbroker.
It is the shared origin, together with the shared sortal $\phi$, that makes these
two possible stories of the same substance.
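
The identity criterion just stated can be illustrated with a small Python sketch. It is offered only as an illustration under assumptions: histories are modeled as tuples of token labels, the function names are hypothetical, and necessary co-extension of sortals is supplied by the caller as a predicate, since nothing in a toy model can compute it.

```python
def shared_initial_segment(c, d):
    """Length of the longest common initial segment of two token sequences."""
    n = 0
    for x, y in zip(c, d):
        if x != y:
            break
        n += 1
    return n


def same_substance(sortal_c, history_c, sortal_d, history_d, coextensive):
    """Identity criterion: necessarily co-extensive sortals plus a shared origin.
    `coextensive` is a caller-supplied predicate (an assumption of this sketch)."""
    return (coextensive(sortal_c, sortal_d)
            and shared_initial_segment(history_c, history_d) > 0)


# Two possible life-stories that share an origin, and one that does not.
philosopher_life = ("birth", "school", "philosophy")
stockbroker_life = ("birth", "school", "brokerage")
other_life = ("another birth", "school", "philosophy")


def same_label(a, b):
    """Crude stand-in for necessary co-extension of sortals."""
    return a == b


print(same_substance("human", philosopher_life, "human", stockbroker_life, same_label))
print(same_substance("human", philosopher_life, "human", other_life, same_label))
```

On this encoding the later divergence of the two life-stories is irrelevant to identity; only the sortal and the common origin matter, which is the point of the stipulation.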

Tensed properties, such as ‘having been $\phi^*$’, or ‘going to be $\phi^*$’, can also
be constructed in a similar fashion. In the case of future-tensed properties, we
must make reference to a world (selecting one of the possible futures of the
situation-token), as well as to the situation-token itself. Let ‘$P$’ be an operator
representing the simple past, and ‘$F$’ an operator representing the simple future.
Then we can introduce truth definitions as follows:


$$M, s \models P(C^*_\phi \models \psi^*) \leftrightarrow \exists s'\,(s' \prec s \;\&\; M, s' \models (C^*_\phi \models \psi^*))$$


$$M, w, s \models F(C^*_\phi \models \psi^*) \leftrightarrow \exists s' \sqsubseteq w\,(s \prec s' \;\&\; M, s' \models (C^*_\phi \models \psi^*))$$


Informally, a past-tense attribution of a property to an enduring substance 
is true at a given situation-token if there is some token in its past at which 
the untensed attribution is true. A future-tense attribution of a property to an 
enduring substance is true at a given situation token and a given world if there 
is some token contained by that world and posterior to that token at which the 
untensed attribution is true. Reference to a world is ineliminable in the case 
of a future-tense attribution, since there are many alternative futures for any 
given token, but only one past. 

A tensed attribution of a property to a substance can be true at a situation- 
token, even if that token is not part of the corresponding substance history. 
Consequently, a past-tensed attribution of a property to a substance can be true 
long after the substance has ceased to exist, and a future-tensed attribution can 
be true before the substance has come into being. 
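
The two truth definitions, and the asymmetry between past and future, can be made concrete with a toy model. The following Python sketch is an illustration only: the token names, the two worlds, and the functions `untensed`, `past`, and `future` are assumptions introduced here, with causal priority given as an explicit set of pairs.

```python
# Hypothetical toy model; tokens are strings and every name here is illustrative.
HISTORIES = [("s0", "s1a"), ("s0", "s1b")]   # two life-stories sharing the origin s0
VERIFIES = {
    "s0": {"student"},
    "s1a": {"philosopher"},
    "s1b": {"stockbroker"},
    "s2a": set(),          # a token after the substance has ceased to exist
}
# Causal priority as (earlier, later) pairs, listed already transitively closed.
PRECEDES = {("s0", "s1a"), ("s0", "s1b"), ("s0", "s2a"), ("s1a", "s2a")}
WORLD_A = {"s0", "s1a", "s2a"}
WORLD_B = {"s0", "s1b"}


def untensed(token, situation_type):
    """Untensed attribution: the token belongs to a history of the substance
    and is of the given situation-type."""
    in_some_history = any(token in h for h in HISTORIES)
    return in_some_history and situation_type in VERIFIES[token]


def past(token, situation_type):
    """P: some causally prior token makes the untensed attribution true."""
    return any(later == token and untensed(earlier, situation_type)
               for earlier, later in PRECEDES)


def future(world, token, situation_type):
    """F: some posterior token contained in `world` makes it true."""
    return any(earlier == token and later in world and untensed(later, situation_type)
               for earlier, later in PRECEDES)


print(future(WORLD_A, "s0", "philosopher"))  # depends on which world is selected
print(future(WORLD_B, "s0", "philosopher"))
print(past("s2a", "philosopher"))            # s2a lies outside every history
```

Note that `past` takes no world argument while `future` requires one, mirroring the claim that a token has many alternative futures but only one past.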


18.2 Change and the Johnston Paradox 


Mark Johnston (1984) and David Lewis (1986a) have argued that the common- 
sense theory of enduring substances is incoherent. They argue that one and 
the same substance cannot have (as a whole) two contradictory properties at 
different times, since there is no way to make coherent sense of this tensed predication
of different properties to the same entity. Their solution is to identify
substances with substance-histories and to argue that it is temporal parts of the 
substance/history that have contradictory properties, not the whole. 

Peter Simons (Simons, 1991, pp. 134-135) has produced a convenient typol- 
ogy for classifying different approaches to this problem of tensed predication. 
We can parse a proposition of the form A is F at t in six different ways: 


1. A is-F-at t

2. A is F-at-t

3. A is-at-t F

4. (A is F) at t

5. A ((is F) at t)

6. A-at-t is F


Johnston and Lewis advocate option 6, in which A-at-t is taken to refer to the
temporal part of A located at time t. They take option 1 as misrepresenting the
intrinsic property F (like having two legs or weighing ten stone) as a relational 
property is-F-at, a relation between a thing and a time. Options 2, 3, and 5 
also seem to deny the possibility of change, since each substance timelessly and 
eternally possesses the property of being F-at-t, or the property of being-at-t F. 

My own account is clearly one that takes option 4: (A is F) at t. Tensed 
predication of an intrinsic property to a substance is merely one special case of 
a universal phenomenon: the supporting of situation-types by situation-tokens. 
The time index t is merely a stand-in for some particular situation token located
at t. The predication (A is F) is itself a situation-type, verified by some
situation-tokens, falsified by others, neither verified nor falsified by still others. 


18.3 Zeno’s Paradox and the Instant of Change 


The ancient Eleatic philosopher Zeno raised a number of difficult paradoxes 
concerning the logic of change. One of these concerned the classification of the 
substance at the instant of change. For example, at the moment of death, is the 
human alive or dead? If we say alive, then it would appear that death has not 
yet occurred at the moment of death. If we say dead, then it would appear that 
death has already occurred at the moment of death. Each of these conclusions 
is contradictory. 

Zeno’s paradox is an artifact of our imposing the continuum of metrical time 
upon what is in reality a set of discrete phenomena. In reality, a state of life 
is immediately followed by a state of death. In locating these states within a 
system of time measured by the real numbers, we introduce a new, virtual state, 
the instant of death, that has no real existence. Hence, the question of whether 
the human is alive or dead at that instant has no principled answer. Since the 
instant is the artifact of our own theorizing, we are free to stipulate any answer 
we please. 

What happens when a situation-token s includes temporal parts both before 
and after the change? Should we say that it supports contradictory types, both 
the human is alive and the human is not alive? No, we must recognize that 
these types are not mereologically persistent: each is supported by a part of s, 
but not by the whole. We can find persistent types for such a case: the human 
is alive at some time and the human is not alive at some time. Both of these 
non-contradictory types are supported by s. 


18.4 Hard Cases for Substance Identity 


18.4.1 Autocatalysis and Biological Reproduction 


There are several examples that suggest that my definition of substance history 
(and, consequently, of substance identity) is too liberal. First, consider processes 
of autocatalytic reactions. The presence of a particular molecule in solution 
causes new instances of that molecule to come into existence. We do not want 
to say that the original molecule is identical to each of the products. A similar 
example would be the replication of a particular crystalline form during the 
solidification of some liquid. 

I would argue that in each of these cases, the linkage between one situation- 
token and the next is too weak. The existence of the molecule is not, all by 
itself, a causal explanation of the subsequent existence of the duplicate. What is 
needed is the existence of a favorable environment for autocatalysis. We could 
identify the entire system of molecules and solution as an enduring substance, 
but this seems all right. Similarly, in the case of the replication of a crystalline 
arrangement, the seed crystal is not by itself a sufficient causal explanation for 
the subsequent crystalline layers. 

But now, have I made the condition too stringent? Is the existence of a 
living organism at one moment a sufficient causal explanation of its existence at 
the next? Doesn’t the environment contribute something to the survival of the 
organism, just as it does in the case of autocatalysis or crystalline replication? 
This objection assumes that there is no information about the environment in 
the substance history of a living organism. What persists from one moment to 
the next is a system of functional organization, including both intrinsic functions 
(within the body of the organism) and extrinsic functions (in its environment). 
There is a single, indivisible system of organization for each living organism, 
but these systems overlap considerably in their shared environment. 

Asexual biological reproduction provides another difficult test case. The ex- 
istence of the living parent at one stage is causally sufficient for the existence of 
each of the offspring at a later stage. Now, clearly there is something that en- 
dures through reproduction, namely, the species itself as a substance. However, 
the individual organism does not survive mitosis, since the normal organization 
of the parent does not directly cause the corresponding organization of the chil- 
dren. There is an intermediate stage during which the normal functioning of 
the parent is disrupted. During that stage, the parent ceases to exist, and after 
the division is complete, two or more new individuals come into existence. This 
analysis agrees with that of Peter van Inwagen (van Inwagen (1990)), who also 
locates the persistence of living things in the continuity of the process of life. 

To take another example, suppose a woman dies at the moment of giving 
birth (or at whatever moment we take as constituting the beginning-to-be of 
the woman’s child). Does the woman-child history constitute the basis for the 
existence of a single human organism? No, because we do not have direct 
causal connections between successive stages of the various processes that make 
up human life: respiration, locomotion, digestion, perception, and so on. The 
woman’s last state of respiration is not directly connected to any of the initial 
states of biological activity of the child, and the same thing holds for the other 
organic processes. The causal connection is between a process of reproduction 
on the woman’s side and life on the child’s, not between one stage of life (as a 
whole) and the next. 


18.4.2 Fission and Fusion of Individuals 


The philosophical literature on personal identity is full of science-fiction scenar- 
ios involving the transplantation of all or part of a human brain into another 
human body. Since humans can apparently survive with somewhat less than half 
of their cerebral cortex intact, these scenarios raise the possibility of duplicating 
one person, or fusing several people together. 

These kinds of radical surgery introduce ruptures in the causal connected- 
ness of the organism’s history. In fact, any medical intervention poses at least 
some threat to organic continuity, since the organism’s future functioning is no 
longer explained entirely in terms of its earlier functioning. We can salvage 
personal identity only by subtly shifting the relevant sortal, incorporating the 
practice of medicine as a normal part of our ecological niche. However, organ 
transplantation poses a serious challenge to this strategy, since we begin to lose 
our grip on the definition of the system whose survival is being preserved by the 
medical intervention. It is not medicine per se that poses a challenge to personal 
identity, but medicine that is transformational, rather than merely restorative 
or ameliorative. 

It is difficult to say at exactly what point the challenge becomes insuperable, 
but I would argue that we are well over the line when we reach the transplan- 
tation of significant parts of the cerebrum. At that point, I think we must say 
that there are no survivors, and that the products of the operation are no longer 
human. They should be thought of as new, artificial persons. 


18.4.3 Star Trek Transporter Accidents 


In the TV/movie series Star Trek, there is a transporter device that apparently 
takes one apart, atom by atom, sends the structural information through space, 
and re-assembles a duplicate at the other end. From time to time, the trans- 
porter malfunctions, producing two or more duplicates of the original, with the 
consequent confusions. 

Prima facie, the teleporter is fatal to its customers. The functional continuity 
of the human organism is completely broken, with the teleporter intruding as a 
tertium quid. As in the case of medical intervention, it may be that a shift in the 
sortal, from human to human*, where the continuity of the history of human*s 
includes the normal operation of the teleporter, will provide substances whose 
identities can endure the teleportation. 

What then happens in the case of teleporter duplication? At this point, I 
think we have to bite the bullet and say that Spock (or whoever) is literally 
at two different places, with incompatible properties, at the same time. I will 
explore some of the implications of this possibility in the next subsection. 


18.4.4 Time Loops and Other Anomalies 


Suppose Captain Kirk travels through a wormhole, ends up in the past, and 
meets an earlier version of himself. How many humans are standing on the 
bridge? Do we count Captain Kirk twice? 

In my account of the attribution of properties to substances, I made these 
attributions true or false, not at a moment of time, but at a particular situation- 
token. Normally, one organism will not be at two places at one time, since this 
can only happen in the presence of a temporal anomaly: normally, contem- 
poraneous, spatially separated tokens are causally isolated from one another. 
However, time travel (through wormholes or whatever) can make very abnor- 
mal things possible. We must say not only that Kirk was thin then but heavy-set 
now, but also that Kirk is now bewildered over there, but not now bewildered 
over here. Incompatible attributions are logically consistent, even when they 
are attributed to the same substance at the same time, so long as they are at- 
tributed to different tokens at different places, and so long as these tokens are, 
through some anomaly, causally related. 

When counting the number of people on the bridge, we should count Kirk
only once. Since Kirk now acts in many ways as though he were (per impossibile)
two persons, it may be pragmatically convenient to miscount for certain
purposes. In addition, Kirk can have a number of relations to himself that are 
normally possible only between distinct people. He can, for example, clap his 
right hand against itself and generate noise, should a Zen-like mood pass over 
him. Although Kirk has only two hundred pounds in mass, he can stand on a 
scale and make it register four hundred pounds in weight (so long as he cooper- 
ates with himself in climbing on the scale). If temporal anomalies were common, 
we would have to be a lot more careful how we formulated various physical laws 
involving substances. 

What happens in the case of situations, like that of t, the present time, that 
include both versions of Kirk? Must we say that this token supports the con- 
tradictory type Kirk is angry and not angry? No, once again we must recognize 
that the types Kirk is angry and Kirk is not angry are not mereologically per- 
sistent. If we want to use only persistent types, we would have to say that Kirk 
is angry somewhere and not angry somewhere, which is not self-contradictory. 


18.4.5 Substrate Theory 


I have to confess to having qualms about the reduction of substances to situation 
histories that I have proposed in this chapter. Especially in the case of personal 
identity, I share the widespread sense of uneasiness with the idea that personal 
identity consists in nothing above and beyond causal connections of the right 
kind between distinct situations. I can feel the pull toward some postulation of 
a single entity that lies somehow at the bottom of personal identity. Let us call 
such a unifying entity the substrate of the person or organism. 

If such substrates exist, they might be situation-tokens of a special kind. 
These substrate-tokens would be timeless and non-spatial, but they would be 
causally efficacious. The substrate of person X might be a token that is causally 
responsible for the continuity and perpetuation of the personal substance history 
of X. In other words, the substrate would pull some real weight in the causal 
structure of the world, explaining the coherency over time of certain mental or 
even physiological processes. 

If living organisms in general have substrates, they could be thought of as 
an individual élan vital associated with each organism, a causally necessary 
condition of the sustaining of the organism’s biological functions. 

Although I can see some attraction to such a substrate theory, it would seem 
to be highly speculative. A good case for such a theory would have to make 
good on at least one of the following claims: 


• The existence of substrates is a deliverance of common sense.

• We have direct awareness of at least one substrate (perhaps one’s self).

• The existence of substrates can be supported as an inference to the best
explanation of some phenomenon.


However, these claims seem at best doubtful. Probably the strongest case for 
substrate theory would make use of resilient intuitions we have about personal 
identity. 


18.4.6 Holes 


An enduring substance need not be composed of anything. Electron holes in 
semiconductor media and real holes (containing nothing but a vacuum) in ma- 
terials are two examples. We can trace the history of an electron hole (the 
absence of an electron from a positively ionized atom) as it moves through the 
semiconductor. The hole moves as electrons jump from its next position to its 
current position. The chain of electron-absences is a causally connected history 
of the appropriate kind; so, the hole counts as an enduring substance. Simi- 
larly, a hole in a material substance endures from one moment to the next by 
the same principles of inertia that explain the endurance of the isolated quantity 
of material stuff. 


18.5 Quantum Reality and the Foundations of Materialism


At a bare minimum, materialism entails that the only things that can be causally 
efficacious are things that have spatial and temporal location. (See for example, 
David Charles’s discussion of physicalism (Charles, 1992, p. 280).) What is 
so special about spatiotemporal location? Why should that be thought to be 
essential to causal efficacy? There are two sorts of answers the materialist can 
give to these questions. On the one hand, the materialist could argue that we 
have no positive evidence that atemporal causation exists, that situations with- 
out spatiotemporal location are ever causally efficacious. On the other hand, 
the materialist could argue that the success in the history of science of a par- 
ticular explanatory strategy, one that presupposes the spatiotemporal character 
of causation, provides a powerful argument in favor of minimal materialism. 

It is one of the main burdens of this book to argue that the first materialist 
response is mistaken — that we have in fact ample evidence of the existence of 
atemporal causal connections.¹ First of all, the existence of teleological connections,
attested to throughout the biological and human sciences, points to the
causal efficacy of certain causal and modal facts (section 7.4 and chapter 12). 
Second, the existence of logical and mathematical cognition and knowledge is 
best explained by positing the causal efficacy of facts about logical necessity 
(section 7.3 and chapter 15). Third, the possibility of theoretical cognition and 
knowledge in science requires the existence of an atemporal causal explanation 
of the relative simplicity of causal mechanisms discovered across a wide variety 
of disciplines (section 17.5). Fourth, I argued in chapter 8 that a reasonable 
extrapolation of our success in discovering causes leads to the inferring of the
existence of an uncaused first cause, without spatiotemporal location. The existence
of these independent evidences for extra-spatiotemporal causes means
that the materialist cannot rely upon an appeal to ignorance. Positive reasons
for limiting causation to spacetime must be given.

¹ For a summary of these arguments, see section 21.3.

The second materialist response does involve giving such a positive reason. 
The materialist can point to the progressive success of a particular model of sci- 
entific explanation, stretching continuously from Democritus to Einstein. I will 
call this the DTE model, for “Democritus to Einstein.” The model consists of 
a potentially fruitful strategy for explaining all natural phenomena whatsoever. 
This DTE model depends on the truth of four theses: 


1. The finite complexity of nature: Every phenomenon consists of a finite 
number of simple parts. 


2. Spatial compositionality: Facts about wholes supervene on intrinsic facts 
about their parts, together with the spatial relations between these parts. 


3. Every projectible correlation or other regularity has a causal explanation 
(Reichenbach’s rule). 


4. Causal locality: No action at a spatial or temporal distance. 


If these four theses are true, then we can hope to find a complete causal ex- 
planation of any observable phenomenon by following three steps: (1) analyze 
the phenomenon into its simple parts, (2) find complete causal explanations of 
each of the parts of this region in terms of contiguous facts, and (3) explain 
each feature of the whole region in terms of the features and spatial relations of 
the parts. Materialism entails that a Laplacean intelligence, if supplied by an 
oracle with all facts about the physical characteristics of the ultimate simples 
(unlimited measurement) and with all mathematical truths (unlimited compu- 
tation), would be able to explain all macroscopic properties and all projectible 
patterns and correlations. 

Both Democritean atomism and Einstein’s theory of general relativity, as 
well as many physical theories entertained between these two, fit this general 
materialistic strategy.² For example, general relativity tries to explain all properties
of complex physical objects in terms of the intensities of fields at the
constituent spatiotemporal points and provides a deterministic theory of the 
evolution of these field strengths.

² Although Newtonian mechanics violates principle 4 (locality), due to the instantaneous
action of gravity, the inverse square law guarantees that any violations of locality due to
changes in remote gravitational forces will be negligible. The gravitational attraction of massive
distant objects is an essentially uniform influence on all the particles of an isolated system.

If any of the four theses is denied, then the resulting view cannot be characterized
as one of strict materialism. If, for example, we reject thesis (1), then
we open the door to supra-physical influences. Lord Kelvin, as Crosbie Smith
discusses in his biography of Kelvin (Smith (1989)), recognized that an infinitely
complex nature opened the door to the influence of a supra-physical free will
on natural processes, without necessitating any violations of physical law. The
operation of an infinitely intricate mechanism may be unpredictable in princi- 
ple, even if the basic physical laws are (for any finite system) as deterministic 
as classical Newtonian physics. 

Similarly, if we reject thesis (2), we admit the possibility of strongly emergent 
properties, properties whose existence and causal powers are unpredictable, even 
in principle, from the properties and spatiotemporal arrangements of the parts 
of their possessors. Such strongly emergent properties could be radically
non-physical in nature and include mental or spiritual qualities.

If we reject thesis (3), we are admitting that there is more in heaven and earth 
than is dreamt of in our best causal theories. Reliable patterns and correlations 
exist that cannot be explained by the operation of any physical mechanism. 
This again could open the door to irreducibly immaterial facts and explanations, 
explanations involving interpretation and verstehen, for example. 

Finally, if we reject thesis (4), then we must agree that there is no connection 
between causality and spatiotemporal contiguity. Consequently, there is no way 
to exclude the possibility of the causal influence of entities with no spatiotem- 
poral location whatsoever. If causal locality is not even approximately true — 
if differential causal influence does not go to zero as distance increases — then 
there is no conflict with physical theory in supposing that there is influence 
from states at an infinite distance, i.e., from states outside of spacetime alto- 
gether. Moreover, rejecting thesis (4) moves us in the direction of an ontological 
monism, since the existence of non-local influences challenges the basis of the 
individuation of material bodies. Without the principle of locality, the universe 
would be a single, evolving unity, of which individual material bodies are merely 
partial manifestations. 

In this section, I want to raise some doubts about the second, third, and 
fourth of the assumptions of the materialistic strategy, namely, the conjunction 
of causal locality, spatial compositionality, and Reichenbach’s rule. 

These principles face a serious challenge from quantum mechanics, in par- 
ticular, from Bell’s theorem and the empirical confirmation of the violation of 
Bell’s inequalities. Bell’s results (see Mermin (1981) and van Fraassen (1982)) 
conclusively refute the conjunction of locality and compositionality, since they 
entail that either (1) each thing in the universe is causally non-localizable, or (2)
macroscopic objects (classical systems) are localizable, but their mereological
parts are not. If we take option (1), then the very notion of spatiotemporal
location is undermined. (As I argued in section 5.10.2, action at a distance is 
impossible, because distance is by definition that at which there is no action.) If 
instantaneous action at a distance were possible, then spatial compositionality 
would be of no value: one could reduce the present features of the whole to the 
present features of its parts, but this would not constrain the future states of 
the whole in any meaningful way. 
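
The kind of violation at issue can be illustrated numerically. The sketch below uses the CHSH form of Bell's inequality, which is one standard formulation and is an assumption of this example rather than the form discussed in the works cited; the measurement angles and function names are likewise illustrative. It compares the largest CHSH value attainable when each outcome is fixed locally in advance with the value predicted by quantum mechanics for a spin singlet pair.

```python
import math
from itertools import product


def singlet_correlation(a, b):
    """Quantum prediction for the correlation of spin measurements
    on a singlet pair along directions at angles a and b."""
    return -math.cos(a - b)


def chsh(E, a, a2, b, b2):
    """CHSH combination of four correlations for settings a, a2 and b, b2."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)


def max_local_chsh():
    """Largest |CHSH| when each side carries predetermined +1/-1 outcomes
    for both of its settings (local deterministic assignments)."""
    best = 0
    for A, A2, B, B2 in product((-1, 1), repeat=4):
        best = max(best, abs(A * B - A * B2 + A2 * B + A2 * B2))
    return best


local_bound = max_local_chsh()
quantum_value = abs(chsh(singlet_correlation,
                         0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4))

print(local_bound, quantum_value)
```

The enumeration covers only deterministic local assignments; probabilistic mixtures of them cannot exceed the same bound, since the CHSH combination is linear in the correlations.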

Option (2) corresponds to Heisenberg’s interpretation of quantum 
mechanics (Heisenberg (1958)). This interpretation has not been popular with 
philosophers, largely, I think, because of its blatant inconsistency with mereo- 
logical compositionality. According to Heisenberg, properties such as definite 
position, velocity, and momentum are strongly emergent properties of classical, 
macroscopic systems, not fixed by the properties of the quantum-level, micro- 
scopic parts of those systems (which are characterized by probabilistic quantum 
wave functions, not by the classical properties of definite position and velocity). 
The emergentism of the Copenhagen interpretation leads to the so-called mea- 
surement problem: at exactly what point in a micro/macro interaction does the 
transition from quantum properties to classical properties take place? 

If we assume that what is fundamentally real is the causal and mereological 
structure of situation-tokens and their intrinsic types, then the superimposition 
of a metric spacetime upon the world is a matter of finding the most favorable 
balance between simplicity and empirical adequacy. It would be unreasonable to 
expect that empirical adequacy should always trump the issue of the simplicity 
of the spacetime geometry. The very simple geometries of a single time line 
and Euclidean space (in pre-relativistic physics) and of Minkowski spacetime 
(in special relativity) are viable only through ignoring the occasional misfit 
between these geometries and the actual structure of events. Consequently, 
the attribution of classical spatiotemporal properties to macroscopic objects 
always involves a certain amount of oversimplification. As the scale shrinks, the 
mismatch between classical geometries and the causal structure becomes greater 
and greater, until it breaks down entirely at the level of subatomic interactions 
(as revealed by the failures of the Bell inequalities). The boundary between 
the classical world and the quantum world is consequently a vague one, and the 
measurement problem admits of no precise solution. 

As Bohm and Hiley (Bohm and Hiley, 1993, p. 94) have argued, quantum 
mechanics is essentially about “the process of forming and dissolving wholes,” 
wholes for which mereological compositionality fails. Such wholes play no role 
in classical or relativistic mechanics. 

However, if we try to avoid the rejection of spatial compositionality that is 
explicit in the Heisenberg interpretation by trying to force definite spatiotem- 
poral properties and relations on the objects at the quantum level, the result is 
no better for materialism. The Bell inequality results force, by a kind of recoil, 
the non-locality of the quantum systems to rise to the level of the macroscopic. 
Without causal locality, the very notion of the location of a macroscopic sub- 
stance becomes problematic. We can no longer analyze spatiotemporal location 
in terms of causal relations, since there is no longer any correlation between 
causal relatedness and spatial distance. Location would have to be treated as 
a purely subjective phenomenon, simply a matter of how things appear to us. 
This would mean, moreover, the abandonment of even minimal materialism,
since minimal materialism presupposes the objectivity of location.

³ In The Undivided Universe, David Bohm and Basil Hiley (Bohm and Hiley, 1993, p. 378)
suggest just such an emergence theory of Cartesian space. They suggest that the difficulty of
interpreting quantum mechanics is the result of “trying to force quantum laws into a framework
of a Cartesian order that is really only suitable for classical mechanics.”

Moreover, the limitation of causation to the spatiotemporal is plausible only
if causal order agrees in every case with temporal order. If temporally reversed
causation is acknowledged, as in Cramer’s interpretation of QM (Cramer
(1986)), it seems arbitrary to exclude atemporal causation. If the materialist
tries to preserve compositionality by giving up causal locality and admitting 
superluminal influences, then the theory of relativity entails that he has also 
acknowledged temporally reversed causation, since, if no light signal can reach 
B from A, then in some reference frames A precedes B, and in other frames B 
precedes A. If A causally influences B, then there is a cause that is (in some 
frames of reference) later than its effect. 
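
The frame-dependence of temporal order invoked here follows from the Lorentz transformation of special relativity. As a minimal sketch, assuming motion along a single spatial axis and using illustrative numbers and a hypothetical function name, the following Python fragment computes the time separation of two spacelike-separated events A and B in two different frames.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_separation_in_frame(dt, dx, v):
    """Lorentz-transformed time separation between two events, as measured
    in a frame moving at velocity v along the x axis (|v| < C)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C ** 2)


# A and B are spacelike separated: light cannot cover dx within dt.
dt = 1.0e-9   # B occurs 1 ns after A in the original frame
dx = 3.0      # B is 3 m away from A (light covers only about 0.3 m in 1 ns)

dt_slow = time_separation_in_frame(dt, dx, 0.05 * C)  # slowly moving frame
dt_fast = time_separation_in_frame(dt, dx, 0.5 * C)   # faster frame

print(dt_slow > 0)  # is A still earlier than B in the slow frame?
print(dt_fast < 0)  # is B earlier than A in the fast frame?
```

For timelike-separated events no admissible velocity reverses the sign, which is why the argument applies specifically to superluminal influences.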

On the Heisenberg interpretation, we can still interpret location as an ob- 
jective property of macroscopic systems. Moreover, we can continue to hold the 
principle of causal locality: two systems with spatial location (i.e., two macro- 
scopic systems) can influence one another directly only if they are spatially 
contiguous. Indirect influences via microscopic parts escape the constraints of 
locality, since the parts have no location, strictly speaking. However, under 
most ordinary conditions, such quantum-level effects are negligible. 

There are in fact three levels at which we encounter the primary qualities of 
position, shape, velocity, volume, and so on: 


1. The mereotopological level, the “commonsense,” qualitative geometry at- 
tributable to the network of causal relations. 


2. The metrical geometry of classical physical theory (including Newtonian 
and relativistic mechanics). 


3. The extrinsic functional properties of observable objects in the natural 
human environment that correspond to our perceptions of location, shape, 
and volume. 


At each level, we can talk about such things as relative distance. At the 
level of mereotopology, this talk reflects the causal relations actually holding 
between event-tokens. Spatiotemporal relations at this level accurately reflect 
the underlying causal realities (at both the macroscopic and the microscopic 
levels), but the relations have few formal properties and cannot sustain a simple 
metrical geometry. 

At the level of physical geometry, we sacrifice some accuracy and comprehen- 
siveness for the sake of finding a simple geometry with strong formal properties. 
Classical mechanics, by ignoring the microscopic level, can successfully impose 
a very elegant geometry on the network of causal relations at the macroscopic 
level. 

Finally, just as there are functional properties of observables corresponding 
to the secondary qualities of sensation, so are there functional properties cor- 
responding to the visual and tactile qualities of shape and location. Objects 
of certain types have the extrinsic function (as parts of the natural environ- 
ment of humans) of stimulating certain kinds of sensory representations in the 
human mind, representations corresponding to the primary qualities. At this 
level, Berkeley was quite right to insist that there is no essential difference be- 
tween primary and secondary qualities. The difference between the two kinds of 
qualities consists in the fact that there are systems of properties roughly
corresponding to the sensible primary qualities at the level of causal mereotopology
and of physical theory, while there are no such properties corresponding to the 
secondary qualities. 


18.5.1 The Many Worlds Interpretation 


There is one interpretation of quantum mechanics that would seem to provide
some hope for materialism: the many-worlds interpretation of Everett and
DeWitt. According to this interpretation, there is no collapse of the wave function.
Instead, the superposition of eigenstates represented by the quantum wave func- 
tion corresponds to the existence of multiple states of the universe. Individual 
particles can have many different positions, momenta, and other characteristics 
at the same time. Since the evolution of the wave function is fully local, the 
many-worlds interpretation seems to preserve both locality and mereological 
compositionality. 

On the Everett interpretation, measurement brings about a splitting of the 
world into many worlds or many ‘relative states’ of the world. One fundamental 
question to be faced is this: is world-splitting a causal process or not? If it 
is not a causal process, then Reichenbach’s rule is violated, since splitting into 
coordinated states is used to explain observed correlations. However, if world- 
splitting is a causal process, then it is a very peculiar one. For example, the 
non-occurrence of an interaction can cause a world-splitting, such as our failure 
to observe an electron at one of the slits in the two-slit experiment (Bohm and 
Hiley, 1993, pp. 123-124). In effect, what is going on in other worlds or remote 
parts of this world can produce world-splitting in this world, a very radical sort 
of violation of locality. 

The greatest difficulty with the many-worlds interpretation, noted by Bell, 
Bohm and Hiley, van Fraassen (van Fraassen, 1987, p. 85), and many others, 
concerns the interpretation of quantum probabilities. Suppose that quantum 
theory predicts that a particular measurement has two possible results, one 
having a probability of 75%, the other 25%. According to the many-worlds 
interpretation, both observations will actually be made, each in a different world. 
What then does it mean to say that the first is three times more likely than the 
second? There is no non-circular answer to this question that can be given in 
terms of the standard many-worlds interpretation. 

Another problem for the standard many-worlds interpretation is that of spec- 
ifying a privileged basis for decomposing the quantum function into a plurality 
of precisely defined worlds or “relative states” (in Everett’s terminology). The 
wave function does not by itself determine which operators represent real prop- 
erties, properties that are fully and determinately realized in each of the many 
worlds. Thus, many-worlds theorists must supplement quantum mechanics with 
some sort of metaphysical principle for determining this basis. 

Albert and Loewer (1989) have devised a variant, the many-minds inter- 
pretation, that provides an intelligible meaning for quantum probabilities and 
defines a privileged basis for the decomposition of the wave function. Suppose 
that each brain state of a certain kind is inhabited by infinitely many minds, all
in exactly the same mental state, which state is determined by the underlying 
brain state. In the example given above, each mind makes a transition to one 
of the two observation states. Each mind has an objective chance of 75% of 
ending up in the first state and a 25% chance of ending up in the second state. 

The Albert/Loewer theory is an unusual mixture of physicalist and dualist
elements. The mental state of each mind is wholly determined by the under- 
lying physical state. However, the continuity through time of each mind is 
wholly unrelated to any physical substratum. Albert and Loewer have solved 
the probability-interpretation problem only to create a still more serious prob- 
lem: the problem of accounting for the diachronic identity of individual minds. 
A causal explanation of mental identity through time is excluded, since such 
causal links would have to be independent of physics, and yet the synchronic 
state of each mind is wholly determined by the corresponding physical state. A 
spatiotemporal or corporeal basis for diachronic identity is obviously not
available, since each of the successor minds has exactly the same spatiotemporal
relations as the infinitely many duplicate minds occupying the same physical
state.


19 


Eudaemonism and the 
Objectivity of Value 


The objective reality of value is a given of human experience. The task of phi- 
losophy is not to explain away the objectivity of value, but to reconcile the 
objective existence of value with four apparent problems. These four problems 
include two forms of “queerness” about objective values noted by J. L. Mackie, 
plus the problems of semantics and epistemology. Mackie argued that objective 
values would be queer, incapable of being integrated into a rationally coherent 
ontology, because, unlike all other facts, they (1) provide categorical reasons for 
acting, and (2) are intrinsically motivating. Many have noticed the
epistemological problems inherent in postulating objective values, since they seem to be
imperceptible and without definite location. Finally, the most serious problem
of all is the semantic one: given that there are objective values, how do our 
thoughts and words become linked to one rather than another? 

By bringing on board the theory of teleofunctions, it is possible to revive 
the eudaemonistic theory of objective value developed by Plato and Aristotle. 
This eudaemonistic theory is able to solve the four problems in a systematic 
and principled way. 


19.1 Objectified Subjectivity: A Dead End 


Is the distinction between a good life and a bad one an objective distinction, or 
does “thinking make it so”? This is the fundamental question of meta-ethics. I 
want to defend a robustly objectivist thesis, much more so, I think, than many 
others who are often associated with objective or naturalistic ethics. I do not 
want to identify the objective good with an idealized version of subjective good. 
That is, I do not identify the good with what a suitably idealized person would 
want or value. This sort of approach can achieve only pseudo-objective value, 
not the real thing. 

Proposals of objectified subjective value fall into two categories. First, there
are the intersubjective theories, such as that of Hume’s impartial spectator or
Mill’s competent judges (in chapter 2 of Utilitarianism). Second, there are the 
individualized versions, such as those of Brandt or Railton. 

The intersubjective theories fall prey to two objections. First of all, the 
intersubjective approach can give us no good reason to believe that all idealized 
subjects will want the same thing. Even if it turned out that they did, this 
would be merely a fortunate accident and would still not secure the objectivity 
of value. Convergence of ideal inquirers on some opinion is evidence that the 
opinion is true, because the truth of the opinion is part of the best explanation of 
the convergence. The objective fact with which the opinion is concerned comes 
first, causally speaking: the convergence is causally dependent upon it. This is 
why we cannot simply identify the objective good with that value upon which 
idealized agents would converge, since the convergence cannot explain itself. 

Second, as my colleague T. K. Seung (1993) has persuasively argued, the 
definition of “ideal” as used in every version of the intersubjective approach 
always incorporates the theorist’s prior opinions about what is in fact good. 
The only sort of ideal agent that could be relevant to the project of grounding 
a theory of objective value would be an objectively ideal agent, and this means 
that we must have at least one objective value whose existence is not explained 
by intersubjective convergence. If at least this one, why not many more? An 
ideal agent is one whose cognitive faculties are fulfilling their proper functions. 
Why not identify an objectively good life with one in which some class of proper 
functions is fulfilled? 

The idealized self of Brandt (1979) and Railton (1986a) also fails as a basis 
for genuine objectivity. Brandt and Railton identify what is objectively good 
for me with what my ideal self would want, or what my ideal self would want 
me to want. As in the last case, there are at least potentially some problems 
with giving a value-neutral justification for one form of idealization. Railton ar- 
gues that, since he is trying to give an ontological account and not a conceptual 
analysis of value, the circularity of his theory is not vicious. In effect, Railton 
gives a recursive definition of value: my good is whatever my ideal self would 
want, where the characteristics of my ‘ideal’ self are also determined by what 
my ‘ideal’ self (in the same sense) would want. This is a coherent theory of 
idealized subjectivity. An objectivist would, with some plausibility, argue that 
the principle that value is coextensive with the preference of an ideal self is plau- 
sible only when the standards defining the ideal self are themselves objectively 
valid. However, the more serious problem for Railton’s account concerns the 
possibility of the sort of semantic reference to the good that Railton postulates. 

The ideal-self theorist faces a dilemma. She must either identify the property 
of being good with the property of being wanted by her ideal self, or she must 
identify that property with what Railton calls the “reduction basis” of that 
property, that is, with the conjunction of physical and psychological facts that 
make it the case that her idealized, fully informed self would want what it does. 
In either case, the ideal-self theorist cannot give a viable account of both the 
semantics and the epistemology of ethics. In particular, she cannot explain how 
the property of being good can be causally relevant to our experience. Railton 
(Railton, 1986b, p. 142) explicitly takes such causal relevance of goodness to be
a necessary condition of ethical objectivity: “it is such — and we are such — 
that we are able to interact with it, and this interaction exerts the relevant sort 
of shaping influence or control upon our perceptions, thought, and action.” 

In the first case, I think, the problems are fairly clear. It is hard to see how 
the property of being wanted by an idealized self could be causally efficacious in 
the real world, since it makes reference to the activity of a purely hypothetical 
being. If goodness is not causally efficacious, then facts about goodness cannot 
be involved in my forming opinions about my good, and we run afoul of the 
Gettier examples, being unable to distinguish knowledge from right opinion. In 
this case, ethical inquiry is not the investigation of a condition ontologically 
distinct from the result (under ideal conditions) of the inquiry itself. 

However, the second case is even more problematic.¹ The reduction basis of
the good varies wildly from individual to individual, and from time to time, in 
the case of a given individual. The property of goodness must then be identified 
with an infinite disjunction of conjunctions of possible reduction bases with the 
corresponding fulfillment of the ideal desires (or equivalently, with an infinite 
conjunction of material conditionals of the form, if in physical state R_i, then
in state G_i). Such an infinite disjunction would be a paradigm of a genuinely
disjunctive or gerrymandered type, which, as I argued in section 4.8.1, can never 
be causally relevant or efficacious, contrary to Railton’s intentions. 
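
The equivalence between the two infinitary forms mentioned in this paragraph can be spelled out as follows. This is a sketch under an assumption that is not stated in the text, namely that the reduction bases are mutually exclusive and jointly exhaustive, so that exactly one of them obtains at a given time. The index set I and the predicate letters are illustrative notation only.

```latex
% Assumed notation: R_i = the i-th possible reduction basis,
% G_i = fulfillment of the ideal desires corresponding to R_i, i \in I.
% Labeled assumption (not in the original): exactly one R_k obtains.
\[
  \mathrm{Good} \;\leftrightarrow\; \bigvee_{i \in I} \,(R_i \wedge G_i)
\]
% If R_k is the unique basis that obtains, every disjunct with i \neq k is
% false, so the disjunction reduces to G_k:
\[
  \bigvee_{i \in I} (R_i \wedge G_i) \;\leftrightarrow\; G_k .
\]
% Likewise, every conditional with i \neq k is vacuously true, so the
% conjunction of conditionals also reduces to G_k:
\[
  \bigwedge_{i \in I} (R_i \rightarrow G_i) \;\leftrightarrow\; G_k .
\]
% Hence, under the assumption:
\[
  \bigvee_{i \in I} (R_i \wedge G_i)
  \;\leftrightarrow\;
  \bigwedge_{i \in I} (R_i \rightarrow G_i) .
\]
```

Without the uniqueness assumption the two forms can come apart: if no basis obtains, the conjunction of conditionals is trivially true while the disjunction is false.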

Moreover, this means that our idea of the good must carry the corresponding 
infinitary information, and must have the robust carrying of this information 
as its proper function. However, it is surely impossible for us, given our finite 
capacities and finite evolutionary history, to have any proper function that is 
infinitely complex in nature. How could the fact that a state carries some 
infinitely complex information possibly contribute causally to the existence of 
humankind? The idea of the good has been involved in only finitely many events 
in the evolutionary history of mankind. It is impossible that every disjunct in 
the infinitary disjunction played a causal role in contributing to the successful 
propagation of the species. Hence, the representational content of the idea must 
be finitary. 

In addition, there is an even more fundamental objection to Railton’s ac- 
count. There is no reason to believe that there actually exists a disjunctive type 
that is equivalent to the higher-order type being desired by one’s ideal self, as 
I have already argued in section 16.1. There could have existed many physi- 
cal types that do not in fact exist. Only physical types that exist in actuality 
can be components of actually existing disjunctive types. Hence, no disjunctive 
type, not even an infinitary one, is equivalent in intension to the higher-order 
type. In some worlds, the physical type realizing the reduction basis for the 
ideal-self preferences will be a type that does not even exist in our world and 
that therefore cannot be included in any actual disjunctive type. 

Consequently, the objectification of subjective preference or choice is
inadequate as a basis for objective value. The theories of Hume, Mill, Brandt, and
Railton have value as accounts of the epistemology of ethics. Where they fall
short is in accounting for its metaphysics: what is it that constitutes the reality
of objective value?

1 It is this horn of the dilemma that Railton grasps (Railton, 1986a, p. 25).


19.2 Eudaemonia 


The identification of good with the fulfillment of natural functions originates 
with Plato and is carried on in the eudaemonistic tradition of Aristotle, Aquinas, 
and Butler, to name a few. Eudaemonia is the state of living a perfectly good 
life. 

In order to define a good life in terms of teleofunctions, we must first distin- 
guish between primary and secondary functions. A secondary function is one 
whose proper operation presupposes the failure of some function. For example, 
the function of wound healing is secondary, since its fulfillment presupposes that 
the body has suffered some damage, preventing it from fulfilling one or more of 
its natural functions. The operation of antibodies is a secondary function, pre- 
supposing the presence of disease. Anger is a secondary function, presupposing 
that some function has been thwarted by the unjustified action of others. A 
primary function is any function that is not secondary. 

Clearly it would be a mistake to identify eudaemonia with the fulfillment of 
all of one’s functions. This is an impossible state, since the fulfillment of any 
secondary function presupposes the failure of some other function. Hence, I will 
define eudaemonia as the state in which all of an organism’s primary functions 
have been fulfilled. 
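
The definitions just given, and the argument that the fulfillment of all functions is impossible, can be set out schematically. This is only one possible gloss: the predicate letters and the reading of “presupposes” as strict implication are assumptions introduced here, not part of the account itself.

```latex
% Assumed notation: F = the set of an organism's teleofunctions;
% Ful(f) = "f is fulfilled"; \Box = necessity (glossing "presupposes").
\[
  \mathrm{Sec}(f) \;\equiv\;
  \Box\bigl(\mathrm{Ful}(f) \rightarrow
  \exists g \in F \,(g \neq f \wedge \neg\mathrm{Ful}(g))\bigr),
  \qquad
  \mathrm{Prim}(f) \;\equiv\; \neg\mathrm{Sec}(f).
\]
% Fulfilling every function is impossible once some secondary function exists:
\[
  \exists f \in F\, \mathrm{Sec}(f)
  \;\Longrightarrow\;
  \neg\Diamond\, \forall f \in F\; \mathrm{Ful}(f).
\]
% Eudaemonia, as defined in the text, restricts the quantifier to primary functions:
\[
  \mathrm{Eud} \;\equiv\; \forall f \in F\,
  \bigl(\mathrm{Prim}(f) \rightarrow \mathrm{Ful}(f)\bigr).
\]
```

On this gloss, the second line follows directly: if a secondary function f were fulfilled along with every other function, the consequent of its defining conditional would require some unfulfilled g, which is a contradiction.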

In chapter 12, I developed a moderate, Aristotelian account of teleofunction- 
ality. On the basis of that account, we would expect eudaemonia to consist in 
the fulfillment of a largely harmonious system of functions. Whether or not all 
of these functions can be explained in terms of their contribution to the repro- 
ductive fitness of humans (that is, in terms of natural selection) is an empirical, 
and not a conceptual question. 


19.3 The Connection between Eudaemonia and Motivation


In Ethics: Inventing Right and Wrong (Mackie (1977)), J. L. Mackie argued
against the existence of objective good by charging that such a thing would 
be “queer” in two ways: first, it would, by virtue of simply existing and being 
recognized as such, have an essential power of engaging anyone’s motivations. 
Second, it would, in the same way, have an essential power of providing anyone 
with reason for doing something, independent of his desires or tastes. With a 
eudaemonistic theory of objective good, one can provide an explanation of these 
two special powers that dispels Mackie’s charge of queerness. 

What is it that people really want? Is it that all of their present desires, as- 
pirations, and intentions be fulfilled? No, because it is possible that after having 
fulfilled these, people can find themselves dissatisfied, even miserable, coming
to realize that their desires were faulty, erroneous, and in need of revision. 

Is the goal of all action then the feeling of satisfaction and contentment, or 
attendant feelings of pleasure and ease? Again, this is clearly wrong. If one 
were offered a pill that would guarantee a lifetime of warm and fuzzy feelings, 
but would also guarantee that one would remain idle, ignorant, and friendless, it 
would be foolish to accept such a pill. The feelings of satisfaction and pleasure 
are reliable but fallible indicators of what it is we really want. 

What, from an anthropological point of view, is the function of our will, of 
our ability to want and to desire? It must be to coordinate our actions so that 
we have the greatest chance of fulfilling all of our functions, including digestion, 
metabolism, and reproduction. The point of having a will is to aim at eudae- 
monia. Desires and feelings of pain, pleasure, contentment, and dissatisfaction 
are all mechanisms whose proper function it is to move us in the right direction. 

There is a non-vicious circularity built into the object of our will. We aim 
at the fulfillment of all our functions, including the function of the will (which 
is to aim at the fulfillment of all our functions). In other words, the content of 
eudaemonia for reflective organisms like humans must be specified recursively. 
Kant was wrong to contend that the only good thing is a good will, since in that 
case there would be no base case with which to start the induction. However, 
he was right to include a good will as a part of the good. 

There is an ineluctability to our wanting eudaemonia. We do not choose 
to aim for it, nor could we choose not to. Eudaemonia is the end for whose
sake the faculty of choosing exists. When the will, the faculty for planning and
choosing, is functioning properly, it is aiming at eudaemonia. Humans can vary 
in the degree to which they understand what eudaemonia consists in and how 
best to achieve it, but all humans want eudaemonia as their sole ultimate end. 

Although eudaemonia is an ineluctable end for normal human beings, the 
connection between eudaemonia and human motivation is not an unbreakable 
one. It is possible to be in the grip of defective desires: desires for conditions that 
are destructive of eudaemonia, even desires that are known to be destructive of 
one’s fulfillment. For example, an addict may find himself with an overwhelming
desire for cocaine, even though he is well aware that cocaine is not good for
him.

However, it is too much to expect a theory of objective good to support an 
unbreakable connection between objective good and human desire. It is enough 
if it supports a natural and essential connection between the two, and that 
eudaemonism can supply. 

It is also important to realize that eudaemonia is an inclusive, and not a 
superordinate, end. I am not claiming that everything we want we want as a 
means to the end of eudaemonia. Rather, I am claiming that all of our natural 
desires are desires either for a constituent of eudaemonia or for something that 
tends to contribute to eudaemonia. 

Hume famously argued that reason is the “slave of the passions,” that reason 
uncovers only facts, and that facts cannot motivate without the cooperation of 
desire. A eudaemonist does not entirely disagree with Hume. The goodness of 
eudaemonia is not entirely independent of the fact that we all (qua humans)
desire it. The fact that eudaemonia is good for us depends in part on the 
structure of our faculty of desire. 

Indeed, Hume recognized two functions of reason with respect to the good: 
the verification of the real existence of the proper object of a passion, and the 
selection of adequate means to the attainment of that object (Hume, 1969, 
p. 463, book II, part III, section 3). Thus, reason’s function is not wholly 
instrumental: it must also apprehend the proper object of the subject’s passion. 
The crucial issue that divides the Humean from the eudaemonist is this: are 
the proper objects of our passions invariably specifiable in ethically neutral 
language? If so, there is no room for a normative science of ethics. Reason must 
simply introspect and discover the ethically neutral objects of our passions and 
then apply itself to selecting the most effective means for securing those objects. 

However, suppose that some of our most important passions or desires have, 
as their proper objects, conditions that can only be specified in ethical terms, 
such as wisdom, virtue, or true happiness? Then reason’s non-instrumental 
function in guiding action is not merely one of introspection; it also includes the 
science of ethics, an investigation into what it is we truly want. 

For Hume, there is such a thing as the science of ethics, consisting in the 
discovery of those things that human beings invariably desire and approve. How- 
ever, this science guides our actions only instrumentally, by informing us about 
how we and other humans are likely to act. For the Aristotelian eudaemonist, 
ethics is action-guiding in a more direct way: it clarifies for us the nature of our 
end, which is not (contra Hume) transparent to introspection. 

Mackie’s second charge of queerness concerned the power of objective good 
to provide reasons for action. Mackie followed Hume in adopting a purely 
instrumental conception of reason: the faculty of knowledge has as its purpose 
the selection of means, not of ends. However, on a teleological conception of 
human nature, we have access to a substantive conception of reason. Knowledge 
includes knowing what to want, as well as how to get what one wants. 

The qualification for participating in rational dialogue is the possession of 
properly functioning cognitive faculties. These cognitive faculties include our 
wantings, valuings, and preferrings. One whose mind is diseased in its cona- 
tive faculties is as much disqualified from full membership in the institution of 
reason-giving and reason-taking as is one whose capacities for deductive logic 
are damaged. 

In response to eudaemonism, Mackie argues that the theory depends on a 
confusion of two possible senses of the phrase, the good for man. This could 
mean either (1) “what men in fact pursue or will find ultimately satisfying,” or
(2) “man’s proper end, ... what he ought to be striving after, whether he is or
not” (Mackie, 1977, pp. 46-47). An account of eudaemonia in the first sense is
a descriptive statement, with no implications for normative ethics. An account
of eudaemonia in the second sense already presupposes some positive theory of
the good, and so cannot inform ethics.

2 At the moment, I’m assuming that all deviations from reason are by way of a defect. I’m
setting aside the possibility of the teleological suspension, for the individual, of the ethical
and the rational, a possibility I will take up in section 20.8.

Clearly, an Aristotelian must insist that he uses the word eudaemonia, or 
the phrase the good for man, in the first, descriptive sense. It makes no sense to 
say that we ought to pursue eudaemonia, since we must do so, and oughts only 
apply to things we both can do and can fail to do. However, Mackie simply 
begs the question by claiming that an investigation of eudaemonia in the first 
sense has no implications for normative ethics. Mackie assumes here the very 
descriptive/normative dichotomy that is in dispute. 


19.4 Nature and Nurture 


The nature of an organism (in the normative sense of the word) is constituted 
by the totality of the functional organization of the organism as situated in 
its environment. Nature in this sense cannot be identified with the organism’s 
genetic endowment, nor can it be assumed to be utterly disjoint from the effects 
of the organism’s social and cultural environment. It is natural for human 
beings to have parents, to learn a language, to acquire a history and a network 
of social relationships, no less natural than to eat and drink, or to have a heart 
that circulates the blood. Socialization and acculturation are themselves natural 
processes, processes with natural teleofunctionality. 

A human organism’s capacity to exercise its functions can be damaged by a 
bad upbringing, as much as by defective genes. Ethics, as a part of dialectic, 
is addressed to humans whose rational faculties are in good working order. 
Hence, a reasonably good upbringing is a prerequisite for full participation in 
the practice of ethics as a science. 


19.5 The Unity and Universality of Good 


In the Summa Theologiae, Aquinas asks two critically important questions:
• Does every human have a single ultimate good?
• Do all human beings have the same ultimate good?


Eudaemonism is committed to answering “Yes, for the most part” to both 
questions. The unity of the human organism depends on the harmony, the 
mutual compatibility and even interdependence, of that organism’s primary 
teleofunctions. To the extent that these functions are compatible, eudaemonia 
constitutes a coherent system of ends. 

If the system of ends of a human being becomes radically discordant, falling 
into two or more mutually exclusive sets, then the unity of personality has disin- 
tegrated, resulting in incoherent and mutually adversive actions. The science of 
ethics presupposes that its participants are rational, cognitively healthy individ- 
uals. Hence, ethics presupposes that each has a single, approximately coherent 
end. 
There is a special branch of ethics, which we might call “corrective ethics,”
that deals with the situation of persons who fall far from the ideal of rational 
coherency. This branch is concerned with the identification and study of sec- 
ondary functions, whose operation presupposes the existence of some degree of 
irrationality or personal disintegration. The study of such phenomena as guilt, 
shame, remorse, and regret would fall within this branch. 

Do all human beings share the same ultimate end? To the extent that all tele- 
ofunctions are explained by means of Darwinian selection, then this must be so, 
at least approximately. All humans share a common evolutionary background, 
and nearly all of the functional aspects of the human personality were fixed in 
prehistoric times. Many of these natural teleofunctions are schematic or
abstract, in the sense that their concrete realization depends heavily on his- 
torical and cultural contingencies. For example, every human has the natural 
function of speaking a language, but there is no particular language which it is 
natural for all (or even some) humans to speak. It is natural for every human 
to have a conception of his family’s history, but there is no particular historical 
conception that it is natural to have. It is this schematic character of human 
nature that makes possible the wide variety of culture across human societies 
and periods of history. 

These underlying anthropological functions provide a standard for evaluating 
various cultures. We can ask how well some aspect of a human culture fulfills the 
abstract teleofunction it serves. We can evaluate, for example, how well a given 
language fulfills the teleofunctions of language: how well it avoids ambiguity and 
confusion, how efficiently it conveys important information, how aesthetically 
pleasing are its sounds and cadences. The ethical norms and counsels of a culture 
can be subjected to evaluation on similar grounds. Eudaemonia is impossible 
apart from the resources that culture provides, but some cultures do a better 
job of this than others. 

Can members of certain nations acquire entirely new functions in the course 
of cultural evolution? Certain patterns of behavior could become fixed on a 
widespread scale in a society because that behavior promoted some effect valued 
only in that society, for special, historical reasons. Do these cultural functions 
become incorporated into a specialized, historically conditioned form of eudae- 
monia? The answer depends on the extent to which the fulfillment of these 
functions becomes incorporated into the function of the will itself. 

My hunch is that the human mind is so constructed as to resist the incor- 
poration of novel elements into the constitution of eudaemonia. The coherency 
of human action, or, equivalently, the unity of the human personality, is biolog- 
ically imperative. A human who is constantly working at cross-purposes enjoys 
less success in survival and reproduction than one whose personality is directed 
to a single, coherent end. The extent to which the fundamental orientation of 
the will is subject to change in history is a matter for empirical investigation: 
it may be that, as long as the new elements can be readily harmonized with the 
old, the will is subject to a limited degree of historical malleability. 

As we saw in chapter 12 (section 6), natural kinds must be identified at the 
level of populations and not individuals. There can be natural sub-types within 
a given natural kind, varying in some degree in the proper functions that individuals
in each sub-type realize. The most obvious example of this is the division
of the two sexes. There may be other natural sub-types within humankind: 
a natural pattern of distribution of talents that promotes a socially advanta- 
geous division of labor. To the extent that humans do fall into such distinct 
natural sub-types, their natural ends will also vary. Ultimately, eudaemonia 
is something that can be fully realized only at the level of complete societies 
(as Aristotle himself at least partially realized). Nonetheless, there must be a 
limit to this variation among sub-types, if humankind is to constitute a single 
natural kind, and not merely a cluster of symbiotic kinds. There is good reason 
to doubt that the variation is so wide as to include humans who are slaves by 
nature (as in Aristotle’s account in the Politics³) or naturally decadent (as in
Nietzsche’s view). 


19.6 Indeterminacy and Objectivity 


In his work on anti-realism, Michael Dummett suggested that the defining char- 
acteristic of a realist is her commitment to bivalence: to the thesis that every 
proposition in the relevant domain is either true or false, and not both. How- 
ever, as a number of critics have pointed out, there is no necessary connection 
between this criterion and the requirement of ontological commitment. A re- 
alist is one who believes that there is some reality, potentially exceeding in its 
richness our capacities to investigate it, that determines which of our proposi- 
tions in the relevant domain are true and which are false. If realism is true, 
then when we agree about some proposition, on the basis of shared knowledge, 
the fact about which we are agreeing plays an indispensable role in the causal 
explanation of our agreement. 

Realism in this sense (the indispensability of reference to corresponding facts 
in fashioning complete causal explanations of our cognitive practices) is fully 
compatible with a failure of bivalence. A proposition of a realistic domain can 
fail to be either true or false, if the corresponding set of facts is itself incomplete. 
Indeed, it is possible to be a realist while accepting that certain propositions in 
the domain are both true and false, so long as there exist facts in a suitable state 
of overdetermination. T. K. Seung and I (Seung and Koons (1997)) have argued 
for just such a realistic overdetermination of fact in the domain of ethics. I would 
argue that the law of non-contradiction is a highly reliable, and yet fallible, rule. 

There is another dimension of indeterminacy in ethics that should be ac- 
knowledged. The existence of an objective ideal of eudaemonia entails only the 
existence of a partial preference ordering of states, based on the set of primary 
functions that are fulfilled in each state. One state is clearly to be preferred to 
a second if the set of functions fulfilled in the first is a superset of the set of 
those fulfilled in the second. Many states will be mutually non-comparable, as 
Isaiah Berlin argued. 
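
The superset criterion just stated can be illustrated with a minimal sketch. The states and the function names in it ("health", "friendship", and so on) are hypothetical placeholders chosen for illustration, not a proposed inventory of primary functions; the only thing taken from the text is the rule that one state is preferred to another when the functions it fulfills strictly include those fulfilled by the other.

```python
def compare(first: frozenset, second: frozenset) -> str:
    """Compare two states by the sets of primary functions each fulfills."""
    if first == second:
        return "equal"
    if first > second:   # proper superset: everything in second, and more
        return "first preferred"
    if first < second:   # proper subset
        return "second preferred"
    return "non-comparable"  # neither set includes the other


# Hypothetical states, each labeled by the functions assumed to be fulfilled in it.
s1 = frozenset({"health", "friendship", "knowledge"})
s2 = frozenset({"health", "friendship"})
s3 = frozenset({"health", "productive work"})

print(compare(s1, s2))
print(compare(s2, s3))
```

On this sketch s1 is ranked above s2, while s2 and s3 are left unranked, since each fulfills a function that the other lacks. The ordering is partial in exactly this sense.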


³ See the discussion of slavery and natural rights in (Arnhart (1998)).
When we find ourselves in a state in which it is impossible to fulfill all of our
primary functions, certain secondary functions are activated. These secondary 
functions guide decision making in sub-optimal conditions. The proper opera- 
tion of these secondary functions, when risk is involved, provides a normative 
standard from which we can construct a cardinal utility or welfare function. 
Presumably, this welfare function will be highly underdetermined by the set of 
secondary decision-making functions. We could model such an underdetermined 
function by a set of acceptable functions, or by a single, interval-valued function. 
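
The two modeling options can be compared in a small sketch. Everything specific in it is an assumption made for illustration: the three states, the two candidate welfare functions, and their numerical values are invented, and nothing here is meant as an estimate of actual welfare.

```python
from typing import Dict, List, Tuple

# Each acceptable welfare function is represented as a table from states to values.
Welfare = Dict[str, float]

# Two hypothetical functions, both assumed compatible with the secondary functions.
ACCEPTABLE: List[Welfare] = [
    {"a": 0.9, "b": 0.4, "c": 0.5},
    {"a": 0.7, "b": 0.6, "c": 0.3},
]


def interval_welfare(state: str, acceptable: List[Welfare]) -> Tuple[float, float]:
    """Interval-valued welfare: the lowest and highest value any acceptable function assigns."""
    values = [w[state] for w in acceptable]
    return (min(values), max(values))


def determinately_better(x: str, y: str, acceptable: List[Welfare]) -> bool:
    """x is determinately better than y if every acceptable function ranks x above y."""
    return all(w[x] > w[y] for w in acceptable)


print(interval_welfare("b", ACCEPTABLE))
print(determinately_better("a", "b", ACCEPTABLE))
print(determinately_better("b", "c", ACCEPTABLE))
```

In this toy case state a is determinately better than b, while b and c are not ranked either way. The set-of-functions model retains somewhat more information than the interval model, since the intervals alone do not record which values come from the same function.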


19.7 The Semantics and Epistemology 
of Ethics 


Since practical reason has guiding us toward eudaemonia as its principal proper 
function, the exercise of practical reason is a reliable means of achieving eu- 
daemonia. Insofar as our practical judgments are the product of healthy, well- 
functioning cognitive faculties, they will tend to be accurate. 

All of our thoughts are causally connected to the property of eudaemonia, 
since eudaemonia is the natural end to which our faculties are ordered. This 
means that our concept of goodness will have an exceptional semantic basis. 
We can distinguish the concept of the good from other concepts by virtue of its 
cognitive-functional role. Characterizing an act or state as good is to qualify it 
as a possible goal for action. Characterizing an act as best, all things considered, 
is tantamount to adopting the act as an intention. 

Among the sources of knowledge of the good, the first and most important 
is personal ethical development. One grows in one’s knowledge of the good as 
one’s cognitive faculties develop and mature. Since this intellectual maturity 
is itself a great good, growth in the knowledge of the good depends on success 
in achieving the good. As one’s capacities for the exercise of one’s functions 
improve, this brings with it a fuller expression of one’s cognitive and conative
faculties. This ethical development depends in large part on a good upbringing. 
The example of mature character in one’s elders and instruction through story 
and precept both play an indispensable role. 

A second important source of ethical knowledge is the testimony of the wise, 
of persons who have already achieved a high level of developmental maturity 
and experience. There is a certain degree of unavoidable circularity in appealing 
to the wise, since the ability to recognize wisdom is itself a product of ethical 
knowledge. Nonetheless, this circularity can lead to a beneficial recursion, rather 
than vicious stagnation. 

Thirdly, the experiences of pleasure and displeasure, contentment and dissat- 
isfaction, and other subjective impressions of one’s degree of welfare are natural 
indicators of eudaemonia, reliable although fallible. 

Fourthly, there is the possibility of scientific knowledge of eudaemonia, based 
on the empirical investigation of the teleological generalizations and connections 
associated with human nature. Medicine and physiology can clearly give us 
insight into the nature of health, a key component of eudaemonia. Sociobiology
and evolutionary psychology can reveal which aspects of human behavior and 
social organization have been adaptive. 


19.8 Eudaemonism versus Evolutionary 
Ethics 


It is important to distinguish teleological eudaemonism from the various versions 
of evolutionary ethics that have been proposed since the time of Darwin. I will 
group evolutionary ethics into three categories: (1) the Whig interpretation of 
evolution, (2) nature as a moral paradigm, and (3) survival value as the ultimate 
value. 

There are elements of the Whig interpretation of evolution in Herbert Spencer, 
but I think it has been more influential in popular culture than among scientists 
or philosophers. The WIE is implicit in the first series of Star Trek, where the 
characters are constantly discussing whether or not some alien species is ‘more 
evolved’ than they are. What I mean by WIE is the view that evolution has 
a determinate direction, an absolute Up and Down, with human beings on the 
top (at least so far). Earlier forms of life are early stages of a process with a 
predetermined terminus, namely, civilized Homo sapiens (the Victorian English- 
man), or, perhaps, some form of Uebermensch just over the horizon. The WIE 
is not incompatible with teleological eudaemonism, but the two are logically 
independent. The kind of cosmic teleology implied by the WIE is not part of 
the immanent teleology of human functioning required by eudaemonism. 

The Whig interpretation of evolution is in conflict with eudaemonism if it 
asserts that higher forms of life are better, and that we have some sort of obli- 
gation to further the process of evolution. Eudaemonism is essentially conser- 
vative, backward looking: what determines eudaemonia for us is what functions 
have been realized so far in human nature, not what new functions may arise in 
new forms of life in the future. 

A number of ethicists have proposed that nature, as described by Darwin- 
ism or evolutionary biology, be taken as an ethical paradigm to be imitated. 
Some have urged that since natural selection exists, it must be right, so it is 
right for us to be tough-minded and allow inferior humans to perish. Dewey, 
believing that evolution is a Good Thing, adopts flexibility and mutability as 
the ultimate values of human life. All of these forms of evolutionary ethics are 
guilty of a fairly crude version of the “naturalistic fallacy”: taking whatever 
exists (on a large and permanent enough scale) to be ethically paradigmatic. 
Teleological eudaemonism is committed to no such inference. The measure for 
human happiness is human nature, not some cosmic phenomenon. We are to 
be fulfillers of human nature, not imitators of nature. 

Some evolutionary ethicists, such as B. F. Skinner, have taken survival, or 
the survival or reproduction of human genes, to be the ultimate value, in terms 
of which all other projects and intentions are to be evaluated. According to 
this view, survival value is the only value, objectively speaking. Knowledge,
friendship, music, and creativity are all valuable, if at all, only as means to 
survival. 

Survivalism of this sort is not compatible with eudaemonism. According 
to eudaemonism, one’s survival and reproduction are real values, since most if 
not all of our functions have these as natural ends, but these are not the only 
values. All of the components of eudaemonia, including survival, friendship, 
virtue, productive work, and knowledge, are equally ultimate. We could imagine 
a creature so constructed by nature that each of its choices was made in light 
of its expected reproductive value, but human beings are clearly not such
creatures. Nor is it clear that such creatures would be better at reproducing 
themselves than we are: it is often the case that one’s chances of achieving X 
are increased if one forgets about X and seeks Y instead. 

I might add here that, contrary to the views of genetic imperialists such 
as Richard Dawkins, there is no value to the reproduction of one’s genes per 
se, nor do genes have self-reproduction per se as their natural function. For 
example, suppose a wealthy industrialist decides against having children, but 
instead builds several large factories producing tons of DNA that duplicates his 
own genes. He has these frozen, loaded into canisters, and launched into deep 
space. Have the industrialist’s genes fulfilled their natural functions? Far from 
it. The natural end of reproduction is the reproduction of a form of life, not of 
a quantity of chemical compound. 

Survivalism is based on a confusion between survival’s being the ultimate 
function of some feature and survival’s being the ultimate value for human 
choice. If Darwinism is true, then all functions can be explained in terms of 
natural selection. This entails that all functional features have, as their ultimate 
function, the function of contributing to the reproduction of some biological 
kind. The heart has the function of pumping blood, and it also has the ultimate 
function of enabling us to survive and reproduce. However, the existence of this 
ultimate function does not annihilate the reality of the proximate function. It 
would be wrong to say that the function of the heart is to ensure reproduction, 
and not to pump the blood. Similarly, our capacity to love our families and 
friends presumably enhances our reproductive fitness, but these functions are 
fulfilled only when we really do love our friends and families, and not when 
we cynically use them to maximize our chances for reproduction. To enjoy 
eudaemonia, we must fulfill our proximate functions as well as our ultimate 
ones, and the fulfillment of proximate functions can be, from the perspective of 
choice, every bit as much an ultimate value as is reproduction itself. 


19.9 Moore and the Indefinability of Good 


In Principia Ethica (Moore (1922)), G. E. Moore argues that our idea or notion 
of good is indefinable. He applies a very high standard for a correct definition of a 
concept: the definiens must be inter-substitutable salva veritate in any cognitive
context. If the good were definable, then the definition must have exactly the 
same cognitive significance as the concept of good itself. Moore suggests that
the idea of a horse is definable: that there is some ensemble of descriptions of 
parts and their properties and relations that would be cognitively equivalent to 
the concept horse. This seems doubtful. Given any such complex description, it 
seems one could always meaningfully ask, yes, but is every such thing a horse? 
It would seem that Moore’s ‘Open Question’ criterion sets an impossibly high 
standard. 

Moore’s account of the naturalistic fallacy depended on a particular theory 
of intentionality with respect to properties or qualities. Moore believed that, 
whenever we think of a quality, one of two conditions must be met: (1) we have 
direct acquaintance with that quality, or (2) we have a definition of the quality 
in terms of qualities with which we have direct acquaintance. Moore assumed 
that knowledge by acquaintance consists in having the quality itself (in toto) 
immediately present to the mind, perhaps, via the presence of a mental datum 
instantiating the quality, or perhaps via the bare presence of the uninstantiated 
universal. In either case, Moore’s argument depends on assuming that the 
presence-of-x-in-the-mind context is a fully extensional context. In other words, 
if x = y, and x is fully present to the mind, then so is y.

Let M(Q) represent the mental state-type of having the quality Q present to
the mind. Moore’s assumption (which we may call the extensionalist fallacy) is
that if P = Q, then M(P) = M(Q). This then provides us with a clear criterion
for the distinctness or non-identity of qualities. To prove that P ≠ Q, all we
need to do is prove that M(P) ≠ M(Q). How do we establish the non-identity
of mental state-types such as M(Q) and M(P)? The observation of a difference
between the two state-types via introspection would be sufficient. Thus, if we
can distinguish M(P) from M(Q) in introspection, this is sufficient (for Moore)
to establish that P ≠ Q. It is relatively easy for Moore to then establish the
indefinability of good, since the simple idea of good is easily distinguished, in
introspection, from any proposed definiens.
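
Set out schematically, the inference attributed to Moore is a contraposition of the extensionality assumption followed by an appeal to introspection. The layout below only restates the preceding paragraph; the labels (E), (E'), and (I) and the letter D for a proposed definiens are introduced here for convenience and are not Moore's own.

```latex
\begin{align*}
\text{(E)}\quad   & P = Q \;\rightarrow\; M(P) = M(Q)
                  && \text{extensionality assumption} \\
\text{(E$'$)}\quad & M(P) \neq M(Q) \;\rightarrow\; P \neq Q
                  && \text{contrapositive of (E)} \\
\text{(I)}\quad   & M(\mathit{good}) \neq M(D)
                  && \text{introspection, for any proposed definiens } D \\
\therefore\quad   & \mathit{good} \neq D
                  && \text{from (E$'$) and (I)}
\end{align*}
```

The argument is valid as it stands, so any resistance to the conclusion has to be directed at (E) or at (I).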

Moore’s theory of intentionality is not an especially attractive one. The 
notion of the bare presence of a quality to the mind is a fairly mysterious one. 
The account of intentionality developed in chapter 14 avoids such mysterious 
postulations. 

In any event, why should we suppose that the presence-of-x context is an 
extensional one? Couldn’t a quality be present to the mind, but always in some 
guise or under some mode of presentation or other? Couldn’t there be two distinct
intentional state-types, M₁ and M₂, each of which had, as its immediate
object, the same quality Q? If so, demonstrating the distinctness of M₁ and
M₂ (where, say, M₁ involves a simple notion while M₂ involves a complex one)
would have no bearing on whether they denoted the same quality. Hence, in- 
trospection alone provides no criterion for discovering the ontological simplicity 
or complexity of real qualities. 

If the Aristotelian account of teleofunctions developed in chapter 12 is cor- 
rect, then goodness is definable (complex). The primary use of good would be in 
specifying what is good for this or that organism, and something would be good 
for an organism just in case it is or leads to the fulfillment of the organism’s 
teleofunctions. On the Wrightian account, the teleofunctions of an organism
can be specified purely in causal terms, without reference to what is good for 
the organism. Hence, in this case, a non-circular account of the nature of good 
could be given. The definiens would not be cognitively equivalent to the concept 
of goodness, but it would tell us all there is to know about what goodness is, as 
a feature of the world. 


20 


Moral Theory as the 
Teleology of Character 


20.1 Virtue as Both Means and End 


Moral virtue is the disposition to fulfill those of one’s teleofunctions concerned 
with choice and intentional action. As such, the exercise of moral virtue is an 
important component of eudaemonia. The primary teleofunctions of the will 
are included in the set of primary teleofunctions whose fulfillment constitutes 
eudaemonia. 

At the same time, the exercise of moral virtue is a reliable means for the at- 
tainment of the whole of eudaemonia. The system of primary teleofunctions of 
a human being forms an approximately coherent whole: fulfilling any subset of 
the primary teleofunctions assists in (or at least, does not detract from) the ful- 
fillment of the others. Moral virtue is no exception. Although acting virtuously 
may involve some short-term cost in terms of the fulfillment of other functions, 
and although in exceptional cases this cost may be permanent and large, still, 
for the most part, acting virtuously leads in the long term to something close 
to complete eudaemonia. 


20.2 Eudaemonism versus Egoism 


Although each organism has, as its ultimate end, the achievement of its own 
state of eudaemonia, Aristotelian eudaemonism¹ is not a form of egoism (either
ethical or psychological), since one human being’s eudaemonia always includes 
the fulfilment of the eudaemonia of certain others as one of its most important 
constituents. A parent’s primary functions can be fulfilled only if her children 
live and flourish. The primary functions of a human being include the capacity 
for true friendship (in the sense of Aristotle’s Nicomachean Ethics, Books VIII and IX).


¹ In contrast to hedonistic forms of eudaemonism, such as that of Epicurus.
The parent does not use the child’s eudaemonia as a means to the end of her
own eudaemonia: instead, the child’s eudaemonia is an ultimate end for the 
parent, as part of the parent’s eudaemonia. Similarly, true friends do not use 
each other as means. Instead, the constitution of eudaemonia for each friend 
expands to include the eudaemonia of the other. 


20.3 Is and Ought 


As Kant proposed, morality contains categorical imperatives. On a teleological 
account, these imperatives are rooted in our human nature. Every human being, 
as such, is ordered to the fulfillment of the human form of eudaemonia as the 
ultimate end, and it is the ineluctability of eudaemonia as our end, together with 
the inclusion of moral action within eudaemonia, that gives moral requirements 
their categorical nature. 

We can, therefore, derive ‘ought’ from ‘is’, so long as the ‘is’ includes the 
specification of the teleological structure of the agent. Oughts are, in fact, a 
part of what is. For example, consider the following argument: 


1. Human beings necessarily pursue eudaemonia (this pursuit is constitutive
of their being human, and so of their being simpliciter).

2. Mary is a human being.

3. Therefore, Mary necessarily pursues eudaemonia.

4. Moral wisdom is a necessary condition of attaining eudaemonia.

5. Therefore, Mary ought to attain moral wisdom.


The premises are all factual. The conclusion involves an all-in, categorical 
sense of ‘ought’, since none of the premises concerns contingent facts about 
Mary’s peculiar goals or interests.² Instead, they are concerned only with the
concern that is constitutive of Mary’s very being, namely, the pursuit of eudae- 
monia. 

Am I not assuming, without basis, that we ought to fulfill our human nature, 
or that we ought to aim at so doing? I am asserting that we ought to fulfill our 
human nature, but not that we ought to aim at so doing. That we ought to fulfill 
our natures is tautologous. ‘Oughts’ apply only to things with teleofunctions, 
and something ought to be the case for such an organism just in case it involves 
the fulfillment of as many of the primary functions of the organism as possible. 

It makes sense to say that one ought to do something only if it is possible
not to do so. We cannot say that we ought to travel slower than the speed of
light, if it is impossible for us to do otherwise. Similarly, since it is impossible
for us not to aim at eudaemonia, it makes no sense to say that we ought to
aim at it. Since it is possible for humans to be confused or ignorant about
what constitutes eudaemonia, it does make sense to say that we ought to have
eudaemonia (or more precisely, an accurate representation of eudaemonia) in
mind. We ought to make our decisions in light of an explicit and accurate
representation of eudaemonia as a goal. In other words, we ought to act with
practical wisdom.


² Perhaps this is not all that Kant intended in his use of the word categorical. I am
not claiming that the practical necessities of achieving eudaemonia are Kantian categorical
imperatives, only that they possess all the categoricality that is really needed in moral ‘oughts’.


20.4 Sociobiology, Game Theory, and 
Species Relativity 


The science of human sociobiology can shed considerable light on the matter 
of morality by revealing which features of human behavior and social interac- 
tion are adaptive, that is, have in fact contributed to the reproductive fitness 
of their carriers. As I have argued earlier, it is not the gene (in the sense of a 
DNA molecule) whose “selfishness” provides the basis for the definition of adap- 
tation, but rather the “selfishness” of a form of life, including certain patterns 
of behavior. These “selfish” forms of behavior, that is, forms of behavior that 
successfully cause their own replication, include purely altruistic action, such as 
the love of one’s kin or kin-surrogates, and the genuine reciprocity of friendship.

Some moral virtues are concerned with the management of conflict and the 
generation of cooperation. These are virtues of fairness. The mathematical the- 
ory of games provides a very illuminating way of analyzing social interaction in 
terms of strategic equilibria. A strategic equilibrium is a distribution of behav- 
ioral patterns throughout a population that is self-sustaining. Under favorable 
conditions, equal distributions of benefits constitute a uniquely stable equilibrium
(see, for example, Brian Skyrms’s “Sex and Justice” (Skyrms (1994)) or my
“Gauthier and the Rationality of Justice” (Koons (1994b))). Thus, game theory 
can explain how the exercise of fairness can be adaptive and, hence, functional. 
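
The notion of a self-sustaining distribution of behavior can be made concrete with a minimal sketch of replicator dynamics in a divide-the-cake bargaining game. The sketch is offered in the spirit of the evolutionary models cited above and is not a reproduction of any of them: the cake size, the three demands (4, 5, and 6 out of 10), the uniform starting population, and the number of generations are all assumptions chosen for illustration.

```python
from typing import List

CAKE = 10
DEMANDS = [4, 5, 6]  # hypothetical strategies: modest, fair, and greedy demands


def payoff(mine: int, theirs: int) -> float:
    """Each player receives what she demands if the demands are compatible, else nothing."""
    return float(mine) if mine + theirs <= CAKE else 0.0


def step(shares: List[float]) -> List[float]:
    """One generation of discrete replicator dynamics with random pairing."""
    fitness = [
        sum(payoff(d, other) * share for other, share in zip(DEMANDS, shares))
        for d in DEMANDS
    ]
    mean = sum(f * s for f, s in zip(fitness, shares))
    # A strategy's population share grows in proportion to its relative fitness.
    return [s * f / mean for s, f in zip(shares, fitness)]


def evolve(shares: List[float], generations: int = 200) -> List[float]:
    for _ in range(generations):
        shares = step(shares)
    return shares


final = evolve([1 / 3, 1 / 3, 1 / 3])
print([round(s, 3) for s in final])
```

Starting from equal shares of the three strategies, the population in this sketch moves toward the equal-split demand. Other starting points can instead settle into a mixture of modest and greedy demanders, which is one reason the qualification "under favorable conditions" matters.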

Once fairness becomes functional, it becomes part of the end, and not merely 
a means. Once human life acquired fairness as one of its functions, fairness 
became part of that which is reproduced, and not merely a factor contributing to 
the reproduction of something wholly distinct from it. Therefore, it is perfectly 
rational to be fair, even when fairness does not carry any collateral advantages, 
since the exercise of fairness is itself an advantage. 

Since human morality makes reference to those teleofunctions of character 
that are actually realized in human life, the principles of morality pertain only 
to our own species. Different standards of morality apply to other animals who 
engage in decision making, if there are any. Does the specific nature of morality 
somehow undermine its objectivity, or suggest that commonsense morality is 
(as Michael Ruse and E. O. Wilson have described it) an “illusion foisted on 
us by evolution”? Certainly not. A competent Martian anthropologist should 
evaluate the morality of human actions and characters exactly as we do, and 
we should evaluate the morality of Martian characters as they do, in each case 
applying the appropriate set of standards. 
Game-theoretical considerations suggest that many moral principles will be,
at some level of abstraction, universal, or nearly so. It is difficult to imagine a 
stable form of social life lacking any notion of fairness or loyalty, or in which 
cruelty is an end in itself. 


20.5 Elements of a Teleo-Ethological Morality 


An Aristotelian morality of the kind sketched above will include many virtues 
that are self-regarding and that have little or no relation to fairness and rights. 
For example, many of the virtues discussed by Aristotle and by Hume will 
clearly fall within the scope of a teleo-ethological morality: not just fairness, 
reciprocity, and compassion, but also courage, temperance, and resilience, as 
well as humor, gaiety, industriousness, and fidelity. Love of one’s children, and 
fidelity to one’s spouse, will play a central role in moral theory, and will not have 
to be relegated to a footnote or somehow deduced from a wholly disinterested 
love of humanity in general. A proper degree of self-love and of persistence in 
one’s own commitments and projects will be recognized as more than merely 
permissible. 

The study of punishment and guilt falls within the science of secondary 
functions, functions whose operation presupposes the failure of some primary 
function. Punishment plays an important role in sustaining the practice of 
fairness, respect for others, and peacefulness, since in the absence of punish- 
ment, the relative cost of virtue can climb to excessive heights. A willingness to 
bear a fair share of the burden of punishing wrongdoers is itself an important 
component of social responsibility. Guilt can be thought of as a form of pre- 
emptive self-punishment, reducing, through its obvious presence, the need for 
costly punishment transactions. 

The artificial virtues of justice and of good manners are rooted in the social 
teleofunctions of human nature. Humans have a need to be rooted in a culture 
and a history and to internalize the specific norms and standards of that culture. 
Moreover, as social animals, humans require customs that regulate cooperation 
and conflict with a considerable degree of precision and clarity. So long as these 
specific norms are consonant with the general requirements of human nature, 
the pursuit of artificial virtue is an indispensable part of the pursuit of natural 
virtue. 


20.6 Politics and the Natural Law 


Since human beings are, by nature, political animals, the operation of the state 
fulfills certain teleofunctions. A just, well-ordered state is both a means and an 
end in itself. A just state maximizes the chances for the fulfillment of other, non- 
political teleofunctions, and participation in a just state is itself a component 
of eudaemonia. Consequently, a purely instrumental view of the state, as in the 
political theories of Hobbes or of Bentham, is fundamentally mistaken. The state 
does not exist merely to secure peace or maximize pleasure: instead, the proper
functioning of the state is in itself an indispensable part of human welfare.³

The state has certain natural functions, such as the regulation of conflict and 
the punishment of wrongdoers. The requirements of these natural functions 
constitute the basis for natural law. Far from being “nonsense on stilts” (as 
Bentham labeled it), natural law and natural rights are the consequence of the 
teleological structure of human life, given our natural sociability. 

An application of this theory solves an outstanding problem for rule consequentialism
(of the sort defended by Mill in On Liberty, for example), namely,
how to justify the rationality of following the rule in instances where doing so 
causes a net loss in the ultimate value for which the rule exists. If the status of 
the rule or institution is merely instrumental, then the problem is insoluble: it 
is irrational to pursue something as the means to an end when one knows that 
doing so frustrates the end in question. However, from a teleological point of 
view, we can see that certain rules or institutions (establishing certain rights, 
for example) exist because they further certain ends, where the ‘because’ here 
is causal, and at the same time we can recognize that following these rules has 
value as an end in itself, as partly constitutive of our eudaemonia as citizens. 
Once we recognize that such rule-following has value in itself, it is rational to 
forgo other values (even those values explaining the existence of the rule) in 
order to obtain the value of political justice through conforming to right rules. 


20.6.1 The Incoherency of Legal Positivism 


As I noted in chapter 15, there is an analogy between legal positivism (as rep- 
resented by John Austin, H. L. A. Hart, and Hans Kelsen) and a certain kind 
of conventionalism about logic and mathematics (which I called immanent-basis 
conventionalism). In both cases, there is an attempt to ground normativity (in 
the form of rules with definite content) exclusively in social practices, without 
reference to anything transcendent (logical and mathematical facts in the one 
case, principles of natural law in the other). In both cases, a vicious regress 
threatens the coherency of the project. 

The problem can be seen quite clearly by examining the legal positivism of 
Kelsen (1967). Kelsen recognizes that not every pattern of behavior, and not 
even every pattern that is coercively enforced, counts as a legally valid rule. 
There must be meta-rules, norms of legal validity or recognition, that bestow 
legal validity upon the law. So, for example, in Britain the principal norm of 
recognition consists of the principle that a law is whatever has been passed by 
Parliament. In the United States, a statute is recognized as federal law when 
it has been passed by both houses of Congress and signed by the President, 
or passed by a supermajority in both houses overriding a Presidential veto, so 
long as the statute has not been declared unconstitutional by the federal courts. 
These norms are, Kelsen recognizes, themselves rules of law. Their validity must 
be grounded in some yet deeper norm. Ultimately, we reach what Kelsen called
the Grundnorm of the legal system, a rule whose validity is somehow given
independently of other norms.


³ See Arnhart’s recent book (Arnhart (1998)) for a fuller development of this theme.

In the analysis of the Grundnorm, Kelsen faced a dilemma. Is the Grund- 
norm itself a valid rule of law, or a raw exercise of power? Qua raw exercise of 
power, the Grundnorm has no legal validity, and so cannot convey any such va- 
lidity to any other rule. The very distinction between the validity of rules in the 
system and the invalidity of patterns outside it comes crashing down. However, 
qua legal rule, the Grundnorm must derive its legal validity from some outside 
source. By hypothesis, the social practices in play provide no more fundamental 
basis than that provided by the Grundnorm. Hence, there must be some prin- 
ciple of natural law that bestows upon the Grundnorm whatever legal validity 
it has. 


20.7 Justice toward Future Generations 


A sharp difference between a teleo-ethological theory of justice and various 
social-contract and libertarian theories, such as those of Rawls, Nozick, or Gau- 
thier, emerges quite clearly when examining the problem of justice toward future 
generations. Suppose that we are contemplating two policies, A and B. The 
two policies will have profound effects on the welfare of people living two hun- 
dred years from now. If we follow A, the welfare of the present generation is 
maximized, but the people alive 200 years from now will live lives that are just 
barely worthwhile. If we follow B, the welfare of the present generation would 
be slightly lower, but future generations, including those 200 years in the future, 
will live lives of comparable value. 

Let G(A) be the set of people who would be alive 200 years from now if 
we now adopt policy A, and let G(B) be the set of people who would be alive 
then if we adopt policy B. The differences in the course of history depending 
on which policy is followed will be profound: who marries whom, and when and 
under what circumstances they conceive and bear children, will depend on which 
policy we adopt. Let us suppose, as seems plausible, that the sets G(A) and 
G(B) are disjoint: the populations that would exist under the two different 
policies have no common members. 

In this case, it is very hard for a social-contract or a rights-based theory 
to say that there is anything wrong with policy A, since no one is injured or 
wronged by pursuing it. No actual present or future person would be better off 
if policy B had been adopted instead: actual present people would be worse off, 
and actual future people would not exist at all. If the lives of the actual future 
population are worthwhile, we might even say that they would have been harmed 
had policy B been adopted instead. 

Moreover, social-contract theories are typically stymied by considerations of 
our asymmetrical relations to our future progeny, since there is no possibility of 
quid pro quo. In social-contract theories from Hobbes to Gauthier, the existence 
of mutual vulnerability and the possibility of mutual advantage lie at the heart 
of the concept of justice. This mutuality is notoriously absent in dealing with 
the stakes of future generations. 

The situation looks entirely different from a teleo-ethological point of view. 
Natural selection would certainly prefer populations whose characters include 
a profound concern for the prosperity of their posterity. The love of a parent 
for her children and grandchildren, and the collective concern of communities 
for their long-term futures, would lie at the very heart of moral theory, rather 
than being something attached as an afterthought in the third appendix, as is 
typical in modern ethical theory. 


20.8 Kierkegaard and the Teleological 
Suspension of the Ethical 


If all teleology has a Darwinian foundation (as per the thin theory of teleology), 
then all human beings share the same teleofunctions, since our divergence from 
a common ancestral population is too recent for significant differences at the 
level of function to have arisen. Even if that divergence were much earlier, 
an exclusively Darwinian account of teleology entails that, for each individual, 
there is some population that shares all of its teleofunctions. This is so because 
Darwinism operates exclusively at the level of populations. Darwinism can 
explain nothing about an individual except by referring it to some population. 

However, if there are fundamental, extra-Darwinian instances of teleological 
connection, there is the possibility of radically individualized teloi. It would be 
possible for an individual human being to have a teleofunction that cannot be 
derived from any teleofunctions shared by other individuals. This could involve 
the existence of a higher-order causal law that makes reference to the individual 
in question, a personalized causal law, in other words. Alternatively, it could 
be that each individual is, qua that very individual, the product of a one-off 
intelligent design by a creator, in which the intentions of that creator would 
have a unique kind of authority over that individual (as Kierkegaard believed). 

If we identify ethics with the study of the teleofunctions that are common 
to us as humans, or even with the study of those teleofunctions that are com- 
mon to us as members of a given culture, a Sittlichkeit, then the possibility 
of individualized teloi would open up the possibility of the teleological suspen- 
sion of the ethical. That is, the fulfillment of an individual’s eudaemonia might 
require what would be in other humans the violation of some teleofunction. 
Kierkegaard (1985) imagines just such a possibility in Fear and Trembling in 
describing Abraham’s willingness to sacrifice his son Isaac. 

Once again, if we identify reason with the fulfillment of those cognitive func- 
tions that are common to the most specific relevant population of humans, then 
the possibility of individualized eudaemonia would entail the possibility of non- 
rational, even anti-rational, knowledge, which we could call “faith.”4 


4 For a discussion of the rationality of relying upon such non-rational sources of knowledge, 
see my article “Faith, Probability, and Infinite Passion” (Koons (1993)). 
A concrete application of Kierkegaard’s ideas can be seen in Dietrich Bon- 
hoeffer’s decision to participate in the plot to assassinate Hitler. In his Ethics 
(Bonhoeffer (1955)), Bonhoeffer develops a Kierkegaardian moral theology in 
some detail, and in the decision to take part in the plot, Bonhoeffer put this 
theology into practice. According to Bonhoeffer, the decision to attempt to take 
Hitler’s life cannot be justified by recourse to any set of rules derivable from gen- 
eral features of human life. Like Abraham in his decision to prepare Isaac for 
sacrifice, the plotters had to be prepared to act as “responsible” men, responding 
to an evident calling specific to them and their concrete circumstances, without 
justification that appeals to timeless or general principles. 

The radical self-sacrifice of heroes and saints, such as Gandhi, Martin Luther 
King, Jr., or Mother Teresa, provides additional (and non-lethal) examples of 
the same kind of individualized calling or vocation that cannot be reduced to a 
pursuit of eudaemonia, generically considered. 

Surprisingly enough, the possibility of such extra-rational and extra-ethical 
functions is amenable to scientific investigation, and Kierkegaard himself points 
the way. It is quite possible that a number of generally human teleofunctions 
presuppose the existence, in each case, of functional elements that are radically 
individualized. That is, it may be that we have, as humans, a general need 
to transcend the general. In The Sickness unto Death (Kierkegaard (1989)), 
Kierkegaard describes this interplay between the general and the individualized 
as the need for both freedom and necessity, or both possibility and necessity. We 
need a structure of general teleofunctions to provide unity to our lives, especially 
diachronic unity (our need for “necessity”). At the same time, we cannot bear 
to sacrifice our individuality utterly in the pursuit of a rigid and unvarying ideal 
of human eudaemonia (hence, our need for “freedom” or “possibility”). 

According to Kierkegaard, radical evil can be described as the perverse 
fulfillment of our need for freedom. Evil involves the wholesale rejection of any- 
thing independent of my will as the objective and ineluctable end of choice. 
Evil seeks freedom from the constraint of generalized eudaemonia, not by ap- 
prehending an individualized eudaemonia, but in rebelling against eudaemonia 
itself. This rebellion is impossible, since even in rebelling against the pursuit 
of eudaemonia, we are acting in pursuit of one aspect of it, namely, the need 
for freedom. Carried to its ultimate conclusion, evil leads to the annihilation 
of personhood, since the will cannot bind its own future decisions, and so the 
unity of the person through time that is provided by the pursuit of a single end 
is lost. 

Kierkegaard’s conception of evil or “the demonic” prefigures Nietzsche’s ideal 
of the creation of new values and Sartre’s analysis of humans as “condemned to 
be free” (Sartre (1956)). 


21 


A Coherent Realism Is a 
Comprehensive Realism 


21.1 The Four Waves of Anti-Realism 


A comprehensive form of realism, as exemplified by Plato, Aristotle, and Boethius, 
was the dominant school in Western philosophy from the time of Augustine un- 
til that of Scotus. Today, a comprehensive form of anti-realism, as exemplified 
by Rorty, Foucault, or Derrida, is at or near dominance in the academy. The 
transition from the first state to the last took place in four great waves: Oc- 
cam, Bacon, Hume and the post-modernists. These waves correspond to the 
dismantling, one by one, of Aristotle’s four causes: formal, final, efficient, and 
material. 

Nominalists such as Occam rejected the real existence of properties, types, 
and other universals. All that exists is individual: all predicates and other gen- 
eral terms refer distributively to their many satisfiers, not to a single universal 
entity. Thus, nominalists denied the reality of Aristotle’s formal cause: form as 
such does not exist. 

Although it took several hundred years for this conclusion to be explicitly 
drawn, it follows from the rejection of form that there can exist no real final 
causes. Final causation implies a real relationship between an individual and a 
form that is only partially or imperfectly realized in the present state of that 
individual. If forms are unreal, so are such relationships. 

Descartes, Bacon, and Galileo urged that final causation be banished from 
natural philosophy. This was to some extent justified by the over-reliance of 
Aristotelians on final causation, especially in physics. Moreover, the concen- 
tration of scientific research on matters of efficient causation undoubtedly con- 
tributed to the rapid growth of physical and chemical sciences in the early 
modern period. However, the banishment of final causation to the realms of 
a priori psychology and revealed theology was unjustified and has done great 
harm to both philosophy and science. 


1 Students of Richard M. Weaver will recognize the influence of his analysis of modern 
history in Ideas Have Consequences (Weaver (1948)). 

Bacon and Descartes did not deny the existence of final causation absolutely, 
but they denied its existence within nature. All final causation was made depen- 
dent on the intentions of conscious agents, whether human or divine. Anything 
that is not a human artifact could have a proper function only by reference to 
the design intentions of God. This identification of final causation with divine 
intention led to the subsequent confusion by many of teleological explanation 
with the attribution of perfection or optimization. 

Once final causation was relegated to revealed theology, it was inevitable 
that a Hume would appear, who would attempt a thoroughly non-teleological 
account of the human mind. Epistemology thus became the study of the op- 
erations of the human mind, without reference to the proper functions of the 
human faculties. As Hume so clearly saw, the operationalist empiricism that 
results undermines the rationality of induction and renders causal connection 
inaccessible. Consequently, the third of Aristotle’s causes, the efficient cause, 
went under. Kant attempted to minimize the damage of this loss by making 
causation an unavoidable projection of the finite understanding, rather than the 
accidental result of associations in this or that individual human being. With 
Hume and Kant representing the two alternative poles, one of individualistic 
subjectivism, and the other of universal, inter-subjective anti-realism, modern 
philosophy has sought out many devices for reconstructing epistemology and 
ethics without the use of either final or efficient causation, without notable 
success. 

Post-modernism has been the natural response to the evident failures of 
modern philosophy. Without final or efficient causation to tie human ideas to 
objective reality, the materialistic story of modern scientific philosophy becomes 
merely one story among many equally legitimate alternatives. Since truth is 
impossible, reason becomes optional. 

Post-modernism will turn out, I believe, to be a transitional episode, and 
not a permanent condition. The absolute indifference to intellectual discipline 
that post-modernism fosters will inevitably provoke a reaction in the opposite 
direction. Indeed, the reaction has already begun, as evidenced by the Aus- 
tralian materialism of David Armstrong, Frank Jackson, and others, and the 
teleological naturalism of Millikan, Dretske, and Papineau. A coherent and 
viable alternative to the failures of modern philosophy and the vacuity of post- 
modernism must, and I think will, be built on the restoration of all four of 
Aristotle’s causes. By recognizing that our cognitive faculties are objectively 
ordered to the end of truth, and by recognizing that universal types are every 
bit as real as particular instances, we can successfully defend the possibility 
of both truth and knowledge. Moreover, since our volitional faculties are also 
objectively ordered to a systematic end (human eudaemonia), we can close 
the infamous fact/value gap and restore ethics to its rightful place among the 
sciences. 


21.2 A Prolegomenon to Any Future 
Critique of Metaphysics 


Since the work of Hume and Kant, the work of metaphysics has taken place 
under a cloud of suspicion. Empiricists and positivists have held metaphysics 
to be unscientific because it postulates entities, causal connections, substances, 
universals, numbers, etc., that are not directly verifiable by the senses. 

Metaphysicians have not been alone in this predicament. Scientists who 
insist on interpreting the theoretical entities of science realistically fall under 
the same suspicion. Locke was skeptical not only about scholastic metaphysics, 
but also about Newton’s mechanics, and van Fraassen rejects not only universals 
and causal connections, but also electrons and magnetic fields. 

The central dogma underlying the positivist critique of metaphysics is the 
privileged status of sense perception. Whatever can be justified can be justified 
(according to the positivist) in terms of sense perception, or sense perception 
plus deductive logic. The positivists owe the rest of us an explanation of why 
we should grant this exclusive privilege to one or two modes of knowing, at the 
expense of all others. 

The basis for the privileging of sense perception seems to be the matter of 
reliability. There are two reasons for thinking that our knowledge of our own 
sensory surface stimulations (to use Quine’s phrase) is more reliable than our 
knowledge of other facts: causal distance and inferential distance. The process 
conveying information to me from a rock or an electron is much longer than 
the process conveying information to me from the immediate environment of 
my sense organs. Similarly, the process carrying information to my own sense 
organs is much shorter than the process by which natural selection conveys 
information to the innate structures of my natural kind. A longer process is 
more susceptible to malfunction, ceteris paribus. Hence, the shorter process 
is more reliable. Similarly, any knowledge gained by inference from sensory 
knowledge involves additional steps, during which additional errors can occur. 

However, ceteris is not always paribus. As Fred Dretske has pointed out, 
our knowledge of distal facts is often much more reliable than our knowledge of 
proximal stimulations. I am much better at learning the pattern of the distribu- 
tion of furniture in my office than I am at learning the pattern of stimulation of 
my retina. My innate knowledge of arithmetic is more reliable still, and much of 
our inferential knowledge, for instance, our knowledge of the power of gravity, is 
more reliably formed than our knowledge of the results of any single experiment. 

Where positivists and empiricists are right is in insisting that there be the 
possibility of some kind of causal connection, direct or indirect, between us and 
any postulated entity. In the absence of such a causal connection, there can 
be no reliability, and where there is no reliability, there can be no knowledge. 
Where they are wrong is in limiting this causal connection to the five senses. 

A philosopher who is empirical in spirit rejects a priori certitudes in philos- 
ophy as bad methodology. This must include rejecting the a priori certitudes of 
empiricism. 


21.3 Causalism, Yes! Materialism, No! 


On the questions of the philosophy of mind, analytic philosophers have tended 
to divide into two camps: the naturalists and the mysterians. The naturalists 
hold to some form of the mind/brain identity thesis, insisting that all the facts 
there are can be accommodated within materialism. The mysterians insist, to 
the contrary, that there is a subjective, introspectible aspect to consciousness, 
and, perhaps, that there is a phenomenon of basic and underived intentionality 
that cannot be accounted for by materialistic theories. 

I am by and large sympathetic to the strategy undertaken by contemporary 
naturalists: (1) to explain the phenomenal character of conscious experience in 
terms of intentionality, (2) to explain intentionality by means of information 
and proper function, and (3) to give a causal account of both information and 
proper function. This strategy, however, is available to non-materialists as well. 
The resulting account of the mind is better termed the “causalist theory of the 
mind,” rather than the “materialist theory of the mind.” 

A non-materialist causalism has one major advantage over the materialist 
account of the mind: explaining the causal efficacy of mental states. On the 
causalist account, a token mental state consists of two parts: one supporting 
certain physical and physiological types relating to the central nervous system, 
and one supporting higher-order causal connections between those states and 
their intrinsic purposes (the carrying of information or the execution of behav- 
ior). Since a materialist is committed to the dogma that only spatiotemporally 
located tokens can be causally efficacious, he must hold that only the physical 
component of a mental state causes behavior; the teleofunctional component is 
causally otiose. On the causalist account, in contrast, higher order tokens can 
interact with tokens whose functions are themselves higher-order in nature, i.e., 
second-order tokens can be causally efficacious in interactions with third-order 
tokens. 

In any event, there are several good reasons for rejecting materialism that 
are entirely independent of the issues in the philosophy of mind. 


• The causal efficacy of modal facts, and, through them, of logical and 
mathematical facts (chapter 15). 

• The constructibility of metrical spacetime as a simple approximation to 
the qualitative relations determined by the causal network (section 4.10.2). 

• The existence of an extra-spatial first cause of the cosmos (chapter 8). 

• The existence of a cause of the simplicity of the causal structures underly- 
ing many observable phenomena, as required for a realistic interpretation 
of scientific theory (section 17.5). 

• The refutation of the principles of materialistic compositionality and causal 
locality by the failure of Bell’s inequalities for quantum phenomena (sec- 
tion 18.5). 

In part I, I demonstrated that modal and causal facts that are themselves 
not spatially or temporally located can act as causes of concrete events. In 
chapter 12, I used higher-order causation to give an account of the teleological 
connections we find throughout the biological and human worlds. In chapter 
15, I argued that such an account of higher-order causation is needed if we 
are to have an account of the possibility of logical and mathematical thought, 
in particular, in order to solve Benacerraf’s problem of the indeterminacy of 
reference in mathematics. These non-spatiotemporal facts are counter-examples 
to materialism, which is committed to the view that only entities located in space 
and time can be causally efficacious. 

In section 4.10.2 of part I, I indicated that a mereotopological approach to 
qualitative, commonsense spatial and temporal relations could be based on the 
theory of causation. I further suggested that the metrical spacetime of physical 
theory is based on a much simplified picture of these qualitative spatiotemporal 
relations. Consequently, it is unreasonable to assume that spacetime is a uni- 
versal receptacle into which all situation-tokens must be fit. The most we can 
ask of the spacetime of physics is that it provide a very simple, useful framework 
into which many tokens and their relations can be fit with at least approximate 
success. 

In chapter 8, I argued that considerations based on the apparent universality 
of causation should lead us to the conclusion that there is a necessary fact that is 
the uncaused first cause of all wholly contingent states. This necessary fact most 
probably involves no entities that are material or spatiotemporal in character, 
since any fact involving such entities would be at least partially contingent. 

In section 17.5 and in a forthcoming article (Koons (2000)), I argued that a 
realistic interpretation of scientific theory depends on the objective reliability of 
our inductive methods, including our preference for simplicity. This objective 
reliability in turn depends on the existence of a cause that explains why many 
observable phenomena are the product of relatively simple causal structures. 
This cause of the uniform simplicity of observable phenomena must itself be 
non-physical in nature. 

Finally, I have argued in chapter 18 that the failure of Bell’s inequalities 
strengthens the case for rejecting the ontology of materialism. The attractive- 
ness and plausibility of materialism depends on the principle of materialistic 
compositionality: the thesis that any fact about any composite entity can be 
explained (without remainder) in terms of intrinsic facts about its parts and 
their spatiotemporal relations. 

Materialistic compositionality is analogous to compositionality in linguis- 
tics. The meaning of complex expressions in a compositional language can be 
deduced from the intrinsic meaning of the parts of the expression, together 
with their spatiotemporal relations. This means that the meanings of complex 
expressions are never strongly emergent, so we do not have to resort to some 
non-recursive faculty of interpretation to explain our understanding of novel 
sentences. Similarly, materialistic compositionality means that we can explain 
any novel physical phenomenon on the basis of a classification of its parts by 
their intrinsic characters, an account of their spatiotemporal relations, and the 
use of a finite number of functions. The connection between materialistic com- 
positionality and materialism lies in the assumption that the only relations that 
matter are spatiotemporal relations. If we permit other relations between parts 
to figure in our canonical explanations, we open the door to ghostly and mys- 
terious relations, like being neurons in the relation that carries the intrinsic 
meaning red. 

Both Democritean atomism and Einsteinian field theory satisfy the principle 
of materialistic compositionality. In the case of field theory, the number of 
parts of a material object is non-denumerable, since each point in spacetime 
constitutes a part of the field. However, given the field strengths at each point 
and the spatiotemporal relations between the points, we have all we need to 
explain every physical phenomenon, according to general relativity. 

However, materialistic compositionality is demonstrably false, thanks to the 
falsification of the Bell inequalities by quantum results. The failure of the Bell 
inequalities leaves us with only four options: 


1. Reject the thesis that quantum objects have spatiotemporal relations (the 
Copenhagen interpretation). 


2. Allow for superluminal influences (pilot waves or Everett-style world- 
splitting). 


3. Allow for backward causation (the influence of present experimental set- 
tings on past events). 


4. Recognize the existence of irreducibly non-spatiotemporal relations among 
distant physical parts (holism). 


Of these, only (2) and (3) are compatible with materialistic compositionality. 
Both (1) and (4) explicitly contradict compositionality: (1) because it denies 
that all of the parts of a physical system have spatiotemporal relations to one 
another, and (4) because it requires that relations other than spatiotemporal 
ones play an irreducibly real role in physical causation. Options (2) and (3), 
however, still undermine materialism, since they entail that no two physical 
systems can be causally isolated from each other. This means that the resolu- 
tion of a complex phenomenon into simple parts gets us no closer to a complete 
explanation of the phenomenon, since each of those parts can interact with an 
unlimited number of remote factors. 
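
For readers who want the failure of the Bell inequalities in concrete form, the following is a minimal numerical sketch and not a restatement of the argument of chapter 18. It assumes the CHSH version of the inequality and the standard quantum prediction for a spin-singlet pair, E(x, y) = -cos(x - y). The function names and the choice of measurement angles are illustrative and do not come from the text.

```python
import itertools
import math


def chsh(correlation, a, a_alt, b, b_alt):
    """CHSH combination of four correlations for two settings on each side."""
    return (correlation(a, b) - correlation(a, b_alt)
            + correlation(a_alt, b) + correlation(a_alt, b_alt))


def singlet_correlation(x, y):
    """Standard quantum prediction for a spin-singlet pair measured at angles x, y."""
    return -math.cos(x - y)


def local_deterministic_bound():
    """Largest |S| when each setting has a definite, pre-assigned outcome of +1 or -1."""
    best = 0
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4):
        s = a1 * b1 - a1 * b2 + a2 * b1 + a2 * b2
        best = max(best, abs(s))
    return best


# Angles (in radians) chosen to maximize the quantum value; an illustrative choice.
quantum_value = abs(chsh(singlet_correlation,
                         0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4))

print(local_deterministic_bound())  # bound for pre-assigned local outcomes
print(quantum_value)                # quantum prediction for the singlet state
```

Every assignment of definite outcomes to the four measurement settings keeps the CHSH sum within 2, whereas the singlet correlations reach 2√2 (about 2.83). It is this gap that the four options above are attempts to accommodate.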

The Bell inequalities force us to recognize strongly emergent properties in 
each classical system, that is, properties that do not supervene on the intrin- 
sic characters (stable states) plus the spatiotemporal relations (if any) of their 
quantum-level parts. If we adopt, as seems most reasonable, some variant of the 
Copenhagen interpretation (such as Heisenberg’s), we are left with the conclu- 
sion that position and velocity themselves are such strongly emergent properties. 
Our quantum-level parts have no position, although we may attribute something 
like position to them when they interact with a classical measurement system. 
This attribution of intermittent position to quantum particles should be thought 
of as only analogous to the attribution of position to the measurement device 
itself. Quantum-level phenomena are unimaginably strange. The principle of 
no-action-at-a-distance simply does not apply to them, since they do not really 
stand at a distance from one another or from us. My own actual position in space 
is not determined by any feature of my quantum-level parts, since compounding 
quantum systems yields only more complex probabilistic wave functions, never 
definite spatiotemporal properties. 

Consequently, there is no reason to assume that my psychological state is de- 
termined by the physical features of my body’s parts. Since the physical world is 
itself divided, home to strongly emergent properties, respect for physics provides 
no reason to rule out the possibility of properties that are strongly emergent 
relative to both the quantum-level and classical-level physical attributes of the 
body. 

I have discussed strict materialism, rather than “physicalism,” because phys- 
icalism is too vague and indeterminate a doctrine to serve as the topic of a coher- 
ent philosophical discussion. If “physicalism” means that science will eventually 
be unified, with a single set of laws and concepts, then no one today defends 
such a doctrine (to my knowledge). If physicalism means that everything that 
exists and every actual causal explanation has a “complete” description in the 
“ideal physics of the future,” then the doctrine is so vague as to be essentially 
meaningless. Does this doctrine entail the spatio-temporal contiguity of causes 
and effects, or mereological compositionality? Does it exclude the existence of 
an uncaused first cause? Who knows? Who can tell? The history of physics 
over the last one hundred years gives us reason to believe that the physics of 
the future will be unimaginably different from today’s theories. 

If “physicalism” means that a complete and sound description of all causal 
connections can be given in terms of today’s best physical theories, the doc- 
trine is, besides being wildly implausible, still seriously underdefined, since to- 
day’s physics includes quantum mechanics, which is subject to a wide variety of 
metaphysical interpretations. Metaphysical theory is radically underdetermined 
by an austere mathematical formalism such as quantum mechanics. For these 
reasons, instead of criticizing physicalism, I have chosen to criticize a precise 
metaphysical position, that of strict materialism. 


21.4 Anti-Realist Obscurantism 


Arguments for anti-realism typically take the following form: 


1. It is difficult to account for our epistemic access to facts about X. 


2. Therefore, we have no epistemic access to facts about X. 


3. Therefore, either (a) there are no facts about X, or (b) the facts about X 
are merely projections of our own judgments about X, when made under 
ideal circumstances. 




For many domains (ethics, mathematics, theory of universals, causation), 
premise (1) is clearly true, and the inference from (2) to (3) seems correct. 
However, the inference from (1) to (2) is clearly a weak point. Anti-realism is 
simply a strategy for dodging the hard problems of epistemology, for taking by 
theft what ought to be earned by hard toil (in Russell’s phrase). 

The difficulties referred to in premise (1) are due to inadequacies in our 
models of causation and of knowledge itself. In this book, I have begun to 
develop a conception of causation that is flexible enough to accommodate real 
causal connections between concrete events and timeless conditions, such as 
modal constraints. I trust that I have at least provided some basis for hope that 
the difficulties the anti-realist points to are not insurmountable. 


21.5 Is the Theory Naturalistic? 


Naturalism is all the rage these days, so it is natural to wonder whether the 
theory I have sketched in this book qualifies as naturalistic. There seem to be 
three characteristics shared by most who consider themselves naturalists: 


1. The rejection of a scientifically inaccessible realm of subjectivity; causal 
relevance as the criterion for knowability. 


2. The continuity of philosophical method with the methods of natural sci- 
ence. 


3. A physicalist, or at least materialist, ontology. 


By these standards, my theory is two-thirds naturalistic, since I fall in line 
with the first two characteristics. I am very resistant to acknowledging the 
existence of subjective facts that are accessible only from a first-person per- 
spective. Reality (insofar as we can know it) consists of a causally connected 
network. The very notion of reliability has no application to an irreducibly 
first-person mode of knowing, as Wittgenstein argued in Philosophical Investi- 
gations. Since knowledge entails reliability, this means that such a concept of 
first-person knowledge is incoherent. 

I also follow the same method in philosophy, namely, inference to the best 
explanation, that characterizes good methodology in science. There may be 
some difference between my use of this method and that of many philosophical 
naturalists, since I take the data of philosophy to include more than sensory 
observation. Non-inferential knowledge of logic, mathematics, and ethics also 
counts as legitimate data for philosophical theorizing. 

As I have already made clear, I reject physicalism and materialism. 
Causal efficacy is not limited to space and time. Modal facts can be causally 
effective, despite the fact that they are timeless and placeless. Moreover, there 
are genuine instances of higher-order causation, in which timeless causal facts 
impinge upon the concrete events of spacetime. Metrical space and time are 
constructs that merely approximate the richness and complexity of the world’s 
causal structure. The failure of Bell’s inequalities in the case of quantum 
phenomena provides decisive evidence against the principle of metaphysical 
compositionality that forms the core of the materialist’s research program.² 


² For further arguments against the objectionable sort of naturalism, see the forthcoming 
volume, Naturalism: A Critical Appraisal (Craig and Moreland (2000)), edited by William 
Lane Craig and J. P. Moreland. 


Appendix A 


Partiality, Modality, 
and Conditionals 


A.1 Partial Propositional Logics 


It is conventional in philosophical logic to refer to both three- and four-valued 
logics as “partial logics.” A three-valued logic recognizes the values true, false, 
and neither (undetermined). Four-valued logic adds the possible value of both 
true and false. Three-valued logics are useful in representing partially undefined 
or ontologically incomplete situations. Four-valued logics enable us to deal both 
with incomplete and with logically impossible situations. 

For most purposes, three-valued situation theory will suffice, since all actual 
and possible situation-tokens bear one of three relations to each situation-type 
(verify, falsify, or neither). However, there are two reasons for being interested 
in four-valued logic. First, there are concerns of symmetry and elegance. The 
four values form a kind of lattice, and in many cases, the logic and semantics 
of four-valued systems are simpler than those of the corresponding three-valued 
systems. Second, I am interested in situations that are partial with respect to 
information about logical necessity. Such situations do not recognize the impos- 
sibility of certain logically impossible situations. In order to model such logical 
impossibility, it is convenient to make use of the fiction of logically impossible 
(four-valued) tokens. Logically partial situation-tokens will enable us to recog- 
nize the causal efficacy of logical necessities, which in turn will make possible a 
causal theory of logical reference and knowledge (chapter 15). 

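Here again are the three-valued (strong Kleene) truth tables for negation, 
disjunction, and conjunction. 

$$
\begin{array}{c|c} \phi & \neg\phi \\ \hline T & F \\ F & T \\ U & U \end{array}
\qquad
\begin{array}{c|ccc} \vee & T & F & U \\ \hline T & T & T & T \\ F & T & F & U \\ U & T & U & U \end{array}
\qquad
\begin{array}{c|ccc} \& & T & F & U \\ \hline T & T & F & U \\ F & F & F & F \\ U & U & F & U \end{array}
$$

The corresponding tables for four-valued logic (the Dunn (1976) tables) are 
as follows: 

$$
\begin{array}{c|c} \phi & \neg\phi \\ \hline T & F \\ F & T \\ U & U \\ B & B \end{array}
\qquad
\begin{array}{c|cccc} \vee & T & F & U & B \\ \hline T & T & T & T & T \\ F & T & F & U & B \\ U & T & U & U & T \\ B & T & B & T & B \end{array}
\qquad
\begin{array}{c|cccc} \& & T & F & U & B \\ \hline T & T & F & U & B \\ F & F & F & F & F \\ U & U & F & U & F \\ B & B & F & F & B \end{array}
$$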


The unifying idea behind both the strong Kleene and the Dunn tables is 
simply this: one computes truth-values as one would in classical semantics, 
except that one separates the determination of truth and of falsehood: 

1. $\neg\phi$ is (at least) true if $\phi$ is (at least) false. 
2. $\neg\phi$ is (at least) false if $\phi$ is (at least) true. 
3. $\phi \& \psi$ is at least true if both $\phi$ and $\psi$ are. 
4. $\phi \& \psi$ is at least false if either $\phi$ or $\psi$ is. 
5. $\phi \vee \psi$ is at least true if either $\phi$ or $\psi$ is. 
6. $\phi \vee \psi$ is at least false if both $\phi$ and $\psi$ are. 

By ‘at least true’, I mean either T or B (either true only or both true and 
false), and by ‘at least false’ I mean either F or B (either false only or both 
true and false). A proposition that receives no truth-value from these principles 
retains the classification U. 
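
The six principles can be tried out mechanically. The following is only an illustrative sketch, not part of the formal apparatus: it assumes that a truth-value is coded as a pair of booleans (at least true, at least false), and the names `neg`, `conj`, `disj`, and `table` are chosen here for convenience.

```python
from itertools import product

# A truth value is a pair (at_least_true, at_least_false).
T, F, U, B = (True, False), (False, True), (False, False), (True, True)
NAMES = {T: "T", F: "F", U: "U", B: "B"}

def neg(a):
    # Principles 1 and 2: truth and falsity swap roles.
    return (a[1], a[0])

def conj(a, b):
    # Principles 3 and 4: true if both are true, false if either is false.
    return (a[0] and b[0], a[1] or b[1])

def disj(a, b):
    # Principles 5 and 6: true if either is true, false if both are false.
    return (a[0] or b[0], a[1] and b[1])

def table(op, values):
    """Tabulate a binary connective over the given truth values."""
    return {(NAMES[a], NAMES[b]): NAMES[op(a, b)]
            for a, b in product(values, repeat=2)}

kleene_and = table(conj, [T, F, U])     # three-valued (strong Kleene) conjunction
dunn_and = table(conj, [T, F, U, B])    # four-valued (Dunn) conjunction
dunn_or = table(disj, [T, F, U, B])     # four-valued (Dunn) disjunction
```

Restricting the inputs to T, F, and U never produces B, which is why the same three functions serve for both the three-valued and the four-valued tables.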

The use, within semantic models, of impossible situation-tokens is no more 
troubling, ontologically speaking, than the use of possible-but-not-actual to- 
kens. The only real tokens are actual tokens. Non-actual tokens, whether 
possible or not, are merely useful fictions. Modality (possibility, contingency, 
necessity) pertains primarily to actual situation-types. A paradigmatic modal 
fact is something like: type $\phi$ is possibly instantiated. Since the work of Kripke 
and Kanger, it is widely recognized that it is useful, in representing the seman- 
tics and logic of modal facts, to construct models containing indices that stand 
for merely possible worlds. Each possible world represents the compossibility 
(from the point of view of certain worlds) of a set of types. Similarly, in four- 
valued modal logic, I will use impossible situation-tokens to represent the lack 
of the impossibility (from the point of view of logically partial situations) of the 
co-exemplification of certain types. 

The classical connectives (negation, conjunction, etc.) are functionally com- 
plete with respect to classical two-valued interpretations. Every classical truth- 
function is definable by means of negation and conjunction alone (and also by 
means of negation and disjunction alone, as well as several other combinations of 
classical connectives). The classical connectives are not, however, functionally 
complete with respect to three-valued or four-valued interpretations. To achieve 
functional completeness, we would have to add several new, non-classical con- 
nectives. However, as Thijsse (1992) has proved, the classical connectives are 
functionally complete with respect to an important class of three-valued func- 
tions, namely, those that have the properties of truth-functional persistence, 
classical closure, and freedom. A truth-function is persistent just in case: when- 
ever the truth-values of the inputs are enriched, the truth-value of the output is 
also enriched. By enriching a truth-value, I mean moving from undefined to one 
of the other truth values, or moving from one of the classical truth values to the 
value both true and false. Classical closure requires that whenever the inputs 
are limited to the classical truth-values, the output is also classical. Finally, the 
property of freedom entails that whenever the truth-values of the inputs are all 
undefined, the truth-value of the output must also be undefined. 
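
The three properties can be checked by brute force for any candidate connective. The sketch below is illustrative only; it assumes the pair coding of truth-values (at least true, at least false), reads "enrichment" as including the trivial case in which nothing changes, and uses a hypothetical non-classical connective `defined` as a contrast case.

```python
from itertools import product

T, F, U, B = (True, False), (False, True), (False, False), (True, True)
VALUES = [T, F, U, B]

def enriches(a, b):
    """True if b is a (possibly trivial) enrichment of a."""
    return a[0] <= b[0] and a[1] <= b[1]

def persistent(op, arity):
    """Enriching the inputs never fails to enrich the output."""
    for xs in product(VALUES, repeat=arity):
        for ys in product(VALUES, repeat=arity):
            if all(enriches(x, y) for x, y in zip(xs, ys)):
                if not enriches(op(*xs), op(*ys)):
                    return False
    return True

def classically_closed(op, arity):
    """Classical inputs always give a classical output."""
    return all(op(*xs) in (T, F) for xs in product([T, F], repeat=arity))

def free(op, arity):
    """All-undefined inputs give an undefined output."""
    return op(*([U] * arity)) == U

def conj(a, b):
    return (a[0] and b[0], a[1] or b[1])

def defined(a):
    # Hypothetical non-classical connective: "a has some truth value".
    return F if a == U else T
```

Conjunction passes all three tests, whereas `defined` is classically closed but neither persistent nor free, since it assigns a classical value to wholly undefined input and then revises that value when the input is enriched.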

I have not found any need for logically complex types whose relation to their 
constituent types does not satisfy all three of these requirements. In particular, 
the property of truth-functional persistence is closely related to the interpreta- 
tion of the part-whole relation on situation-tokens. Each coherent situation is 
part of a possible world: as we move from less to more inclusive situations, the 
associated truth function should converge to the classical (bivalent) case. Hence, 
we want formulas to have the property of mereological persistence. A class of 
formulas is mereologically persistent if whenever a situation $s$ verifies a formula 
$\phi$ and $s \sqsubseteq s'$, then $s'$ also verifies $\phi$. If we stipulate that any interpretation of 
the language makes the atomic formulas mereologically persistent, and we use 
only truth-functionally persistent connectives, then it follows that all formulas 
of the language are mereologically persistent. 

Classical closure is important, since we want any use of the overdetermined 
value (both true and false) to be forced by an overdetermination of the value 
of some atomic formula. The property of freedom will have little bearing on the 
project, since I will assume that every situation verifies some formula. Conse- 
quently, I will deal only with the classical connectives throughout this project. 

In the case of four-valued interpretations, the situation is even simpler. If 
we define the dual relation as holding between true and false, and also between 
undefined and overdefined (both true and false), then Thijsse (1992) has proved 
that the classical connectives are functionally complete for the class of persistent, 
duality-preserving truth functions. 

In partial logic, there are no logically true propositions: no proposition is 
true in every three- or four-valued interpretation. We do, however, have non- 
trivial logical implication. In fact, there are a variety of relations that are 
species of logical implication or consequence in partial logic. In three-valued 
logic, there are three notions of logical consequence that seem most natural: 
verification validity, falsification validity, and double-barreled validity. A set $\Gamma$ 
verifiably entails a set $\Delta$ just in case every interpretation that verifies every 
member of $\Gamma$ verifies some member of $\Delta$. A set $\Gamma$ falsifiably entails a set $\Delta$ 
just in case every interpretation that falsifies every member of $\Delta$ falsifies some 
member of $\Gamma$. The relation of double-barreled implication (first suggested by 
Blamey (1986)) holds between a set $\Gamma$ and $\Delta$ just in case $\Gamma$ both verifiably and 
falsifiably implies $\Delta$. Whenever I talk about implication in three-valued logic, 
I will mean double-barreled implication, since this comes closest in many ways 
to the classical case. 

In four-valued logic, the situation is much simpler, in that verification im- 
plication, falsification implication and double-barreled implication all coincide 
(Muskens, 1995, p. 77). 
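
These consequence relations can be tested by enumerating interpretations. The sketch below is illustrative only. It assumes formulas coded as nested tuples, and it reads falsification entailment contrapositively (whatever falsifies every conclusion falsifies some premise), which is the reading that makes conjunction elimination come out valid; the function names are chosen here for convenience.

```python
from itertools import product

T, F, U, B = (True, False), (False, True), (False, False), (True, True)
THREE, FOUR = [T, F, U], [T, F, U, B]

def ev(fm, val):
    """Evaluate a formula (atom name or nested tuple) to a (true, false) pair."""
    if isinstance(fm, str):
        return val[fm]
    if fm[0] == "not":
        t, f = ev(fm[1], val)
        return (f, t)
    (t1, f1), (t2, f2) = ev(fm[1], val), ev(fm[2], val)
    if fm[0] == "and":
        return (t1 and t2, f1 or f2)
    if fm[0] == "or":
        return (t1 or t2, f1 and f2)
    raise ValueError(fm[0])

def atoms(fm):
    return {fm} if isinstance(fm, str) else set().union(*(atoms(x) for x in fm[1:]))

def valuations(formulas, values):
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    for combo in product(values, repeat=len(names)):
        yield dict(zip(names, combo))

def verif_entails(gamma, delta, values):
    # Every interpretation verifying all of gamma verifies some member of delta.
    return all(any(ev(d, v)[0] for d in delta)
               for v in valuations(gamma + delta, values)
               if all(ev(g, v)[0] for g in gamma))

def falsif_entails(gamma, delta, values):
    # Every interpretation falsifying all of delta falsifies some member of gamma.
    return all(any(ev(g, v)[1] for g in gamma)
               for v in valuations(gamma + delta, values)
               if all(ev(d, v)[1] for d in delta))

def double_barreled(gamma, delta, values):
    return verif_entails(gamma, delta, values) and falsif_entails(gamma, delta, values)

contradiction = [("and", "p", ("not", "p"))]
excluded_middle = [("or", "q", ("not", "q"))]
```

On this sketch, the step from a contradiction in one atom to excluded middle in another is double-barreled valid over the three values but fails once B is admitted, and excluded middle itself is not verified by every three-valued interpretation, in line with the remark that partial logic has no logical truths.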

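Muskens (1995) has proved that the following system of rules (rL+*) is 
complete for double-barreled implication in three-valued propositional logic: 

• (R1) $\neg\neg\phi \dashv\vdash \phi$ 
• (R2) $\neg(\phi \& \psi) \dashv\vdash \neg\phi \vee \neg\psi$ 
• (R3) $\neg(\phi \vee \psi) \dashv\vdash \neg\phi \& \neg\psi$ 
• (R4) $\phi \& \psi \vdash \phi$ and $\phi \& \psi \vdash \psi$ 
• (R5) $\phi \vdash \phi \vee \psi$ and $\psi \vdash \phi \vee \psi$ 
• (R6) If $\phi, \rho \vdash \chi$, and $\psi, \rho \vdash \chi$, then $(\phi \vee \psi), \rho \vdash \chi$. 
• (R7) If $\chi \vdash \rho, \phi$, and $\chi \vdash \rho, \psi$, then $\chi \vdash (\phi \& \psi), \rho$. 
• (R8*) $\phi \& \neg\phi \vdash \psi \vee \neg\psi$ 
• (R9) If $\phi \vdash \psi$ and $\psi \vdash \chi$, then $\phi \vdash \chi$. 
• (R10) $\Gamma \vdash \Delta$ if and only if there are nonempty sets $\{\alpha_1, \ldots, \alpha_m\} \subseteq \Gamma$ and 
$\{\beta_1, \ldots, \beta_n\} \subseteq \Delta$ such that $\alpha_1 \& \ldots \& \alpha_m \vdash \beta_1 \vee \ldots \vee \beta_n$. 

For four-valued logic, Muskens demonstrates that the system rL is sound 
and complete, where rL is rL+* minus (R8*). 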


A.2 Partial Modal Logics 


I take it as obvious that there is some important connection between causation 
and modality. As Hume famously observed, causation involves some kind of 
“necessary connection.” Consequently, I need to make use of a partial version 
of modal logic. Fortunately, the groundwork for partial modal logic has already 
been laid by Thijsse (1992) and Muskens (1995). In this work, I will follow 
Muskens very closely. 

A model in partial modal logic shall consist of a quintuple $\langle Sit, R^{\uparrow}, R^{\downarrow}, I, \sqsubseteq \rangle$. 
Sit is the set of situation-tokens, which are essentially partial (incomplete 
or overdetermined) worlds. The part-whole relation $\sqsubseteq$ is a partial ordering 
on the class Sit. The interpretation function $I$ assigns truth-values (true, false, 
neither, or both) to atomic symbols, representing persistent situation-types. 
Whenever $s \sqsubseteq s'$, I will require that $I(\phi, s')$ be an enrichment of $I(\phi, s)$ for 
every atomic symbol $\phi$. (Remember: every other value is an enrichment of the 
value undefined, and the value both is an enrichment of the two classical values.) 

The relations $R^{\uparrow}$ and $R^{\downarrow}$ are binary relations on Sit. These are the outer 
and inner accessibility relations. If we have $sR^{\uparrow}s'$, then we fail to falsify the 
accessibility of $s'$ from $s$: the model treats the accessibility of $s'$ from $s$ as either 
definitely established or undefined. Dually, if we have $sR^{\downarrow}s'$, then we definitely 
verify the accessibility of $s'$ from $s$. This gives us four possible values for the 
accessibility of $s'$ from $s$: 

• The relation is undefined: we have $sR^{\uparrow}s'$, but not $sR^{\downarrow}s'$. 
• The relation is verified only: we have both $sR^{\uparrow}s'$ and $sR^{\downarrow}s'$. 
• The relation is falsified only: we have neither $sR^{\uparrow}s'$ nor $sR^{\downarrow}s'$. 
• The relation is both falsified and verified: we have $sR^{\downarrow}s'$, but not $sR^{\uparrow}s'$. 

Whenever a situation-token $s$ is logically possible, we have both that $I(\phi)(s)$ 
is true, false, or undefined for every $\phi$ (never both true and false), and the image 
$R^{\downarrow}[\{s\}]$ is a subset of the image $R^{\uparrow}[\{s\}]$ (no situation is both definitely accessible 
to $s$ and definitely not accessible to $s$). 

I will require that modal facts be persistent. So, if $s \sqsubseteq s'$, then $R^{\uparrow}[\{s'\}] \subseteq 
R^{\uparrow}[\{s\}]$, and $R^{\downarrow}[\{s\}] \subseteq R^{\downarrow}[\{s'\}]$. In other words, as we move from a smaller to 
a larger situation-token, the set of definitely inaccessible tokens monotonically 
increases, as does the set of definitely accessible tokens. 

The truth and falsity definitions for the modal operators are quite simple: 


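• $M, s \models \Box\phi \Leftrightarrow \neg\exists x \in R^{\uparrow}[\{s\}]\; M, x \models \neg\phi$. 
• $M, s \models \neg\Box\phi \Leftrightarrow \exists x \in R^{\downarrow}[\{s\}]\; M, x \models \neg\phi$. 
• $M, s \models \Diamond\phi \Leftrightarrow \exists x \in R^{\downarrow}[\{s\}]\; M, x \models \phi$. 
• $M, s \models \neg\Diamond\phi \Leftrightarrow \neg\exists x \in R^{\uparrow}[\{s\}]\; M, x \models \phi$. 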


The formula $\Box\phi$ (the necessity of $\phi$) is true in a model at a situation $s$ just 
in case $\phi$ is never falsified by any token in the outer accessibility set of $s$ (the set 
of tokens that are not definitely inaccessible). This formula is false in a model 
at $s$ if $\phi$ is falsified by some situation-token in the inner accessibility set of $s$ 
(the tokens that are definitely accessible). The possibility operator $\Diamond$ is defined 
as the dual of $\Box$. 
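
A small model makes these clauses concrete. The sketch below is illustrative only: the two situations, the atom p, and the dictionary layout of the model are assumptions made for the example, with `R_up` standing for the outer and `R_down` for the inner accessibility relation.

```python
U, T = (False, False), (True, False)

def ev(fm, s, model):
    """Return the pair (verified, falsified) for formula fm at situation s."""
    I, R_up, R_down = model
    if isinstance(fm, str):
        return I[s].get(fm, U)           # atoms not mentioned are undefined
    op = fm[0]
    if op == "not":
        t, f = ev(fm[1], s, model)
        return (f, t)
    if op == "box":
        # True: no outer-accessible token falsifies the argument.
        # False: some inner-accessible token falsifies it.
        return (not any(ev(fm[1], x, model)[1] for x in R_up[s]),
                any(ev(fm[1], x, model)[1] for x in R_down[s]))
    if op == "dia":
        # True: some inner-accessible token verifies the argument.
        # False: no outer-accessible token verifies it.
        return (any(ev(fm[1], x, model)[0] for x in R_down[s]),
                not any(ev(fm[1], x, model)[0] for x in R_up[s]))
    raise ValueError(f"unknown operator: {op}")

# Situation "s" is part of "s2"; p is undefined at s and true at s2.
I = {"s": {}, "s2": {"p": T}}
R_up = {"s": {"s", "s2"}, "s2": {"s2"}}   # outer set shrinks from s to s2
R_down = {"s": set(), "s2": {"s2"}}       # inner set grows from s to s2
model = (I, R_up, R_down)
```

In this model the necessity of p is already verified at the smaller situation, although p itself is still undefined there, and the verdict survives the move to the larger situation. The possibility of p is undefined at the smaller situation and verified at the larger one.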

In these definitions, I deviate from the pattern of both Thijsse and Muskens, 
since for my purposes, it is essential that I make all modal facts persistent (with 
respect to moving up the part-whole ordering $\sqsubseteq$), which the Thijsse-Muskens 
truth definitions fail to do. Nonetheless, it is easy to demonstrate that the logic 
of double-barreled consequence in the three-valued case is characterized by the 
Thijsse system M*, and the four-valued logic is characterized by the system 
$M^{--}$, one of Thijsse’s systems (Thijsse, 1992, p. 104). 

The system M* consists of rL+* plus the following rules: 


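• (R11) $\Box\neg\phi \dashv\vdash \neg\Diamond\phi$ 
• (R12) $\Diamond\neg\phi \dashv\vdash \neg\Box\phi$ 
• (R13) $\Diamond(\phi \vee \psi) \vdash \Diamond\phi \vee \Diamond\psi$ 
• (R14) If $\phi \vdash \psi$, then $\Diamond\phi \vdash \Diamond\psi$. 
• (R15) $\Diamond(\phi \& \neg\phi) \vdash \Diamond(\psi \vee \neg\psi)$ 
• (K) $\Box(\phi \rightarrow \psi), \Box\phi \vdash \Box\psi$ 
• (KNec) If $\phi$ is a theorem of the classical (two-valued) system K, then 
$\vdash \Box\phi$. 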

The system M* agrees with the classical modal logic K with respect to all 
theorems that are preceded by a box, since it is impossible to find coherent models 
that falsify any classical validity, and the truth definition of $\Box\phi$ guarantees 
its truth whenever $\phi$ cannot be falsified. 

The system $M^{--}$, which characterizes four-valued modal logic, consists of 
rL plus rules (R11) through (R14), plus two additional rules, (R16) and (R17): 

• (R16) $\Box\phi \& \Box\psi \vdash \Box(\phi \& \psi)$ 
• (R17) If $\phi \vdash \psi$, then $\Box\phi \vdash \Box\psi$. 


The completeness of these two systems for their respective sets of models can 
be demonstrated quite easily by means of the standard construction of canonical 
models. In the case of three-valued (or coherent) modal logic, the set Sit in the 
canonical model consists of the set of consistent, saturated theories of the logic 
M*. A theory of a modal system is a set of sentences that is closed under the 
rules of the system. A theory is consistent if it does not contain both $\phi$ and $\neg\phi$, 
for any formula $\phi$. A theory is saturated if it contains either $\phi$ or $\psi$ whenever it 
contains the disjunction $\phi \vee \psi$. The interpretation function $I$ for the canonical 
model is defined thus: for each atomic formula $\phi$, if $\phi \in \Gamma$, then $I(\phi)(\Gamma) = T$, 
if $\neg\phi \in \Gamma$, then $I(\phi)(\Gamma) = F$, and otherwise $I(\phi)(\Gamma) = U$. Two theories $\Gamma$ and 
$\Delta$ are in the part-whole relation $\sqsubseteq$ just in case $\Gamma$ is a subset of $\Delta$. Finally, we 
need to define the two partial accessibility relations $R^{\uparrow}$ and $R^{\downarrow}$ for the canonical 
model: 


$$\Gamma R^{\uparrow} \Delta \Leftrightarrow \forall\psi\,(\text{‘}\Box\psi\text{’} \in \Gamma \rightarrow \text{‘}\neg\psi\text{’} \notin \Delta)$$


$$\Gamma R^{\downarrow} \Delta \Leftrightarrow \forall\psi\,(\text{‘}\psi\text{’} \in \Delta \rightarrow \text{‘}\Diamond\psi\text{’} \in \Gamma)$$


In the case of four-valued (general) partial modal logic, the canonical model 
is constructed in exactly the same way, except that the class Sit is the set of all 
saturated theories of $M^{--}$, whether or not they are consistent. 

For these completeness proofs, we need a partial version of Lindenbaum’s 
Lemma: 


Lemma A.1 (Lindenbaum’s Lemma) If $\Gamma \nvdash \Delta$, then there is a saturated 
theory $\Gamma'$ such that $\Gamma \subseteq \Gamma'$ and $\Gamma' \cap \Delta = \emptyset$. 


This partial Lindenbaum’s lemma can be proved by a slight modification of 
the usual proof (Muskens (1995)). 

We then prove by induction a canonical model theorem, showing that for 
every situation s in the canonical model (that is, for every saturated theory), 
and every formula $\phi$ of the language, $\phi \in s$ if and only if $M_{can}, s \models \phi$, and 
$\neg\phi \in s$ if and only if $M_{can}, s \models \neg\phi$. This proof follows the usual one in the 
atomic cases and in the cases corresponding to the propositional connectives. 
In the case of the modal connectives, we must make use of the partial-logic 
Lindenbaum’s lemma. 

For example, in the case of the necessity operator $\Box$, we must show that 
$\Box\phi \in s$ if and only if $\Box\phi$ is true at $s$. If $\Box\phi \in s$, then the fact that $\Box\phi$ is 
true at $s$ follows immediately from our definition of $R^{\uparrow}$ for the canonical model, 
and from the truth definition for $\Box$. If $\Box\phi \notin s$, then we must use Lindenbaum’s 
lemma to construct a saturated theory $\Gamma$ containing $\neg\phi$ and disjoint from the set 
$A(s) = \{\psi : \Box\neg\psi \in s\}$. We know that we cannot derive any member of $A(s)$ 
from $\neg\phi$ because, if we could, then, since derivation satisfies contraposition, 
there would be a formula $\psi$ such that $\Box\psi \in s$ and $\psi \vdash \phi$. By rule (R17), this 
would mean that $\Box\phi$ was in $s$, contrary to our assumption. 

There is an alternative way of seeing that the partial modal logics must 
be axiomatizable. We can make use of a translation from partial modal logic 
into classical modal logic, using a technique developed by Gilmore (1974) and 
Feferman (1984). For each atomic formula $\phi$ in the language of partial modal 
logic, we introduce two distinct formulas into the classical language: $\phi^+$ and $\phi^-$. 
We also introduce into the classical logic two independent modal operators, $\Box^{\uparrow}$ 
and $\Box^{\downarrow}$, together with their duals. It is convenient to define two complementary 
translations, + and —. Each atomic formula is translated via + by means of 
the corresponding positive version in the classical language and via — by means 
of the corresponding negative version of the formula. 


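• $(\phi \& \psi)^+ = (\phi^+ \& \psi^+)$ 
• $(\neg\phi)^+ = \phi^-$ 
• $(\Box\phi)^+ = \Box^{\uparrow}\neg\phi^-$ 
• $(\Diamond\phi)^+ = \Diamond^{\downarrow}\phi^+$ 
• $(\phi \& \psi)^- = (\phi^- \vee \psi^-)$ 
• $(\neg\phi)^- = \phi^+$ 
• $(\Box\phi)^- = \Diamond^{\downarrow}\phi^-$ 
• $(\Diamond\phi)^- = \Box^{\uparrow}\neg\phi^+$ 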


In the classical language, we assign independent truth values to the positive 
and negative versions of each atomic formula, and we use two independent 
accessibility relations, one for the $\uparrow$ modalities and one for the $\downarrow$ modalities. A 
set $\Gamma$ entails $\Delta$ in partial modal logic if and only if two conditions are met: (1) 
the translations of $\Gamma$ entail the translations of $\Delta$ in classical modal logic, and 
(2) the translations of the negations of $\Delta$ entail the translations of the negations 
of $\Gamma$ in classical modal logic. In the case of four-valued modal logic, these two 
conditions coincide. 
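
The pair of translations is a simple mutual recursion. The sketch below is illustrative only. It assumes formulas coded as nested tuples, and it assumes that the clauses read as follows: the positive translation of a necessity is an outer box over the negated negative translation, the positive translation of a possibility is an inner diamond over the positive translation, and the negative clauses are the duals. The operator tags `box_up` and `dia_down` and the suffixes on atoms are names invented for the example.

```python
def pos(fm):
    """The + translation: the classical condition under which fm is verified."""
    if isinstance(fm, str):
        return fm + "+"
    op = fm[0]
    if op == "and":
        return ("and", pos(fm[1]), pos(fm[2]))
    if op == "not":
        return neg(fm[1])
    if op == "box":
        return ("box_up", ("not", neg(fm[1])))
    if op == "dia":
        return ("dia_down", pos(fm[1]))
    raise ValueError(op)

def neg(fm):
    """The - translation: the classical condition under which fm is falsified."""
    if isinstance(fm, str):
        return fm + "-"
    op = fm[0]
    if op == "and":
        return ("or", neg(fm[1]), neg(fm[2]))
    if op == "not":
        return pos(fm[1])
    if op == "box":
        return ("dia_down", neg(fm[1]))
    if op == "dia":
        return ("box_up", ("not", pos(fm[1])))
    raise ValueError(op)
```

Negation never survives into the output except under an outer box, which reflects the fact that verification and falsification are tracked by separate classical atoms.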

This correspondence between partial modal logic and classical logic enables 
us to transfer to partial modal logic many of the familiar results of classi- 
cal modal logic, such as decidability and the finite model property (Muskens 
(1995)). 


A.2.1 Reflexive Models 


If we require, as seems natural, that the accessibility relations be reflexive, then 
we can strengthen the two systems M* and $M^{--}$. In the case of three-valued 
modal logic, the class of reflexive models can be characterized by adding two 
new rules: 


(Refl1) $(\phi \& \Box\neg\phi) \vdash \neg\phi$ 
(Refl2) $\phi \vdash (\Diamond\phi \vee \neg\phi)$ 

In the four-valued case, I do not believe that we can characterize the class 
of models in which $R^{\downarrow}$ is reflexive. 

One solution to this latter problem is to introduce a hybrid logic. In models 
of this logic, there is a single designated situation $g$ (the actual situation, 
intuitively). We can require that $g$ be coherent, in the sense that $I(g)$ assigns only 
T, F, or U to all atomic formulas, and $R^{\downarrow}[\{g\}] \subseteq R^{\uparrow}[\{g\}]$, and in addition, we 
could require that every situation in $R^{\downarrow}[\{g\}]$ (the situations that are definitely 
accessible to $g$) be similarly coherent. Other situations in the model, however, 
could be logically incoherent, requiring the use of the four-valued truth 
functions. We could define logical consequence for this system by reference to these 
distinguished worlds $g$: $\Gamma$ entails $\Delta$ if and only if: 


1. every model $M$ such that $M, g_M$ verifies every member of $\Gamma$ is such that 
$M, g_M$ verifies some member of $\Delta$, and 


2. every model $M$ such that $M, g_M$ falsifies every member of $\Delta$ is such that 
$M, g_M$ falsifies some member of $\Gamma$. 


The logic for this system, M™, would consist of rules (R1) through (R17), 
but would lack rules (K) and (KNec). In addition, we could characterize the 
class of reflexive models by using rules (Refll) and (Refl2). 

If we restrict our attention to possible worlds, situation-tokens that are co- 
herent and complete, then we would return to the classical two-valued modal 
logics, such as T, S4, or S5. My intention is not to argue that partial modal 
logic should replace classical modal logic. We still need classical modal logic to 
characterize a certain kind of validity. For instance, the inference from $\Box\phi$ to $\phi$ 
is not locally valid, since there are many situation-tokens that verify the first but 
not the second. However, this same inference is globally valid, since any token 
that verifies the first is embedded in a possible world that somewhere verifies 
the second. The corresponding T axiom, $(\Box\phi \rightarrow \phi)$, is not verified at every 
situation-token, but only because many tokens contain only partial information 
about modality. As the modal information supported by a token is enriched, we 
approach classical modal logic at the limit. Partial modal logic is important in 
representing facts about causal connections between partial tokens, as we saw 
in chapter 7. 


A.3 Partial Conditional Logics 


In chapters 4 and 5, I argued for the possibility that all of the causal laws of our 
world are “oaken” rather than “iron” (to use D. M. Armstrong’s distinction), 
that is, that all of the actual causal laws admit of exceptions. In addition, I 
argued that causes do not strictly necessitate their effects. Instead, I developed 
in chapter 7 an indeterministic model of causation, in which causes make their 
effects extremely probable but not absolutely certain. 

In constructing such a model, I made use of what Michael Morreau (1997) has 
called “fainthearted conditionals.” These conditionals, which I have symbolized 
by means of $\Box\!\!\rightarrow$, have a logic and a semantics that is very similar to that of the 
counterfactual or subjunctive conditionals investigated by Robert Stalnaker and 
David Lewis. Ernest Adams (1975) was the first to investigate the properties of 
such probabilistic, fainthearted conditionals and to note their logical similarities 
to the Stalnaker/Lewis conditionals. 

David Lewis (1973) developed a “system of spheres” semantics for the 
subjunctive conditional. The spheres of these models were intended by Lewis to 
represent varying degrees of similarity of the contained worlds to a designated 
world. This semantics can, however, also be given a different, probabilistic 
interpretation. In the case of a finite model, each sphere can be thought of as 
representing the intersection of all those sets whose probability is greater than or 
equal to $1 - \epsilon$, where $\epsilon$ belongs to some fixed order of infinitesimals. In this way, 
Lewis’s system-of-spheres semantics can be given an interpretation in terms of 
qualitative probabilities: $\phi \mathrel{\Box\!\!\rightarrow} \psi$ represents the condition that the probability 
of $\phi \& \psi$ is infinitely greater than the probability of $\phi \& \neg\psi$, as was demonstrated 
by Lehmann and Magidor (1992) in appendix B of their 1992 essay, “What does 
a conditional knowledge base entail?” (Adams (1975) is the locus classicus of 
this approach; see also Vann McGee’s very insightful paper, McGee (1994).) 

Lehmann and Magidor proved that the conditional logic VW is both sound 
and complete for an interpretation of the conditional in terms of non-standard 
probability theory. In such an interpretation, we assign a non-standard proba- 
bility space to each world in the model. Non-standard probability theory is an 
application of the work by Abraham Robinson (1966) on non-standard analysis. 
(For more details see Keisler (1976) and Cutland (1983).) We know from model 
theory that there are non-standard models of number theory: models in which 
there are non-standard natural numbers, numbers that are larger than any finite 
natural number. These model theoretic results extend, as Robinson showed, to 
real analysis. For example, the number $\frac{1}{h}$, where $h$ is a non-standard natural 
number, is a non-standard, infinitesimal rational number. 

Suppose that we fix on a particular non-standard model of the real numbers, 
$\mathcal{R}^* = \langle R^*, +^*, \times^*, <^*, 0, 1 \rangle$. This model consists of an ordered field, together 
with a map $*$ that takes members of $R$ into members of $R^*$, relations on $R$ into 
relations on $R^*$, etc. More precisely, $*$ is a function from the superstructure 
of $R$ into the superstructure of $R^*$, where the superstructure $V_\infty(X)$ is defined 
recursively as follows: 


• $V_0 = X$, 
• $V_{n+1} = P(V_n) \cup V_n$, where $P(Z)$ is the power set of $Z$. 
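
The recursion can be run on a small finite base set to see how quickly the levels grow. The sketch below is illustrative only: it assumes a two-element base set of atoms in place of the real numbers, and the function names are chosen here for convenience.

```python
from itertools import combinations

def power_set(items):
    """All subsets of items, each as a frozenset."""
    items = list(items)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

def superstructure_levels(atoms, depth):
    """Return [V_0, ..., V_depth], where V_0 = atoms and V_(n+1) = P(V_n) | V_n."""
    levels = [frozenset(atoms)]
    for _ in range(depth):
        prev = levels[-1]
        levels.append(frozenset(power_set(prev) | set(prev)))
    return levels

levels = superstructure_levels({"a", "b"}, 2)
sizes = [len(v) for v in levels]
```

With two atoms the levels have 2, 6, and 66 members. The levels are cumulative, so each level contains the one before it.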


The map $*$ is such that, for every $x \in R$, $x^* = x$, and for every bounded 
formula $\phi$, 


$$\mathcal{R} \models \phi[a_1, \ldots, a_n] \text{ iff } \mathcal{R}^* \models \phi[a_1^*, \ldots, a_n^*]$$


This latter correspondence is known as “the Leibniz principle.” The Leibniz 
principle guarantees that the non-standard counterpart of any standard notion 
shares all of its important (measure-theoretic) properties. The set N* is the 
counterpart of the natural numbers. The non-standard natural numbers are 
those that belong to N* but not to N. 

An object $x$ in the superstructure of $R^*$ is called internal if $x \in y^*$ for some 
$y \in V_\infty(R)$. The set of internal objects of $R^*$ is designated $V^*$. An internal 
object $A$ is hyperfinite just in case there exists a function $f \in V^*$ and a number 
$h \in N^*$ such that $f$ is a one-one mapping of $h$ onto $A$. In other words, the 
hyperfinite sets are those that are treated as though they were finite in the 
non-standard model. 

An R* probability space is a triple (X, F, Pr), where X is a nonempty set, 
F is a Boolean sub-algebra of P(X), and Pr is a function from F into R* that 
meets the usual requirements on a probability function, namely: 


• $Pr(A) \geq 0$, for all $A \in F$. 

• $Pr(X) = 1$. 

• $Pr(A \cup B) = Pr(A) + Pr(B)$, for $A$ and $B$ disjoint members of $F$. 

Conditional probability can be defined in the usual way: 


$$Pr(B/A) = \frac{Pr(A \cap B)}{Pr(A)}$$

A non-standard probabilistic model of conditional logic is one in which an 
R* probability space is assigned to each world in such a way that the field F 
includes every set of worlds definable in the language, and in which the truth 
conditions of the conditional $\Box\!\!\rightarrow$ are defined as follows: 


Definition A.1 (Non-standard probabilistic truth conditions) $M, w \models (\phi \mathrel{\Box\!\!\rightarrow} \psi)$ 
if and only if $1 - Pr_{M,w}(\|\psi\| / \|\phi\|)$ is infinitesimal, or $Pr_{M,w}(\|\phi\|) = 0$. 
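
In a finite model the infinitesimal arithmetic can be replaced by a comparison of orders of magnitude. The sketch below is illustrative only. It assumes that every world w receives a probability of order ε raised to the power `rank[w]`, for a fixed infinitesimal ε and with standard positive coefficients, so that the probability of a set has the order of its lowest-ranked member; the worlds and ranks are invented for the example.

```python
def order(worlds, rank):
    """Order of magnitude (power of epsilon) of the probability of a set of worlds.

    Returns None for the empty set, whose probability is 0.
    """
    return min((rank[w] for w in worlds), default=None)

def fainthearted(phi, psi, rank):
    """phi []-> psi: 1 - Pr(psi/phi) is infinitesimal, or Pr(phi) = 0.

    Since 1 - Pr(psi/phi) = Pr(phi and not psi) / Pr(phi), the ratio is
    infinitesimal exactly when the exceptions are of strictly higher order.
    """
    phi_order = order(phi, rank)
    if phi_order is None:
        return True
    exceptions = order(phi - psi, rank)
    return exceptions is None or exceptions > phi_order

rank = {"flier": 0, "penguin": 1, "flying_penguin": 2}
bird = {"flier", "penguin", "flying_penguin"}
flies = {"flier", "flying_penguin"}
penguin = {"penguin", "flying_penguin"}
```

On these assumptions birds normally fly, penguins normally do not, and both conditionals hold together, which is the characteristic behavior of a conditional that tolerates exceptions.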


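We can define probabilistic consequence, $\models_P$, in the usual way: $\Gamma \models_P \Delta$ if 
and only if every non-standard probabilistic model that verifies every member of 
$\Gamma$ also verifies some member of $\Delta$. Lehmann and Magidor have proved that the 
relation of probabilistic consequence is captured by the logical system VW. 
The rules and axioms of $VW^-$ consist of the following: 


• (RCEC) From $\vdash (\phi \leftrightarrow \psi)$ to infer $\vdash (\chi \mathrel{\Box\!\!\rightarrow} \phi) \leftrightarrow (\chi \mathrel{\Box\!\!\rightarrow} \psi)$. 

• (RKC) From $\vdash (\phi_1 \& \ldots \& \phi_n) \rightarrow \psi$ to infer $\vdash [(\chi \mathrel{\Box\!\!\rightarrow} \phi_1) \& \ldots \& (\chi \mathrel{\Box\!\!\rightarrow} \phi_n)] \rightarrow (\chi \mathrel{\Box\!\!\rightarrow} \psi)$. 

• (Id) $\vdash (\phi \mathrel{\Box\!\!\rightarrow} \phi)$. 

• (Mod) $\vdash \Box\phi \rightarrow (\psi \mathrel{\Box\!\!\rightarrow} \phi)$. 

• (CSO) $\vdash [(\phi \mathrel{\Box\!\!\rightarrow} \psi) \& (\psi \mathrel{\Box\!\!\rightarrow} \phi)] \rightarrow [(\phi \mathrel{\Box\!\!\rightarrow} \chi) \leftrightarrow (\psi \mathrel{\Box\!\!\rightarrow} \chi)]$. 

• (CV) $\vdash [(\phi \mathrel{\Box\!\!\rightarrow} \psi) \& \neg(\phi \mathrel{\Box\!\!\rightarrow} \neg\chi)] \rightarrow [(\phi \& \chi) \mathrel{\Box\!\!\rightarrow} \psi]$. 


Theorem A.1 (Lehmann and Magidor 1992) For a countable language L, 
$\Gamma \models_P \Delta$ if and only if $\Gamma \vdash_{VW^-} \Delta$.¹ 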


¹ Strictly speaking, Lehmann and Magidor proved the theorem for a language L that 
contains no nested conditionals. However, their result can be easily extended to the more general 
case by simply assigning an R* probability space to each world in the model, instead of 
constructing a single space for the entire model, as Lehmann and Magidor do. 

In their proof of the completeness theorem, Lehmann and Magidor show how 
to construct, given a consistent theory $\Gamma$ of $VW^-$, and given a non-standard 
model of analysis $\mathcal{R}^*$, an R* probabilistic model $M$ such that $M \models \Gamma$. 

There are two model conditions that Lewis imposed in his work on counter- 
factuals that no longer make sense when the spheres are interpreted in terms of 
qualitative probability. These conditions are strong and weak centering. Weak 
centering means that the smallest sphere in the system associated with world 
w must contain w. Strong centering means that the smallest sphere associated 
with w must contain nothing but w. These two requirements correspond to two 
axioms of Lewis’s system: 


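(CS) $((\phi \& \psi) \rightarrow (\phi \mathrel{\Box\!\!\rightarrow} \psi))$ 


(MP) $(((\phi \mathrel{\Box\!\!\rightarrow} \psi) \& \phi) \rightarrow \psi)$ 


Neither of these axioms is valid when the conditional $\Box\!\!\rightarrow$ is interpreted as 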
a fainthearted conditional. Thus, nothing corresponding to them will show up 
in my partial conditional logic. 

For simplicity’s sake, I will not make use of Lewis’s system of spheres seman- 
tics. In its place, I will use the more flexible set-selection function semantics. 
In this case, each model contains a set-selection function f. This function takes 
two arguments: a world, and a set of worlds. The function’s output is always a 
set of worlds. In classical conditional logic, we define $\|\phi\|$ to be the set of worlds 
in the model that verify $\phi$. The truth conditions for the conditional can then 
be given very simply: 


$$M, w \models (\phi \mathrel{\Box\!\!\rightarrow} \psi) \Leftrightarrow \forall x \in f(w, \|\phi\|)\; M, x \models \psi$$


Under the interpretation by means of extreme probabilities, we should think 
of $f(s, \|\phi\|)$ as the intersection of all of the propositions whose probability, 
conditional on $\phi$, is, from the perspective of $s$, infinitely close to 1. Various logical 
properties of the conditional can then be represented by placing corresponding 
conditions on the selection function $f$. For instance, the probabilistic Adams 
conditional requires the following conditions: 


1. $f(s, A) \subseteq A$. 
2. $f(s, A \cup B) \subseteq f(s, A) \cup f(s, B)$. 
3. If $f(s, A) \subseteq B$ and $f(s, B) \subseteq A$, then $f(s, A) = f(s, B)$. 
4. If $f(s, A) \cap B \neq \emptyset$, then $f(s, A \cap B) \subseteq f(s, A)$. 
5. $f(s, A) \subseteq R[\{s\}]$. 


The first condition guarantees that all of the normal A worlds are indeed 
A worlds. The second condition ensures that if $\chi$ is extremely probable on 
$\phi \vee \psi$, then it must be extremely probable on either $\phi$ or on $\psi$ (or both). 
The third condition indicates that probabilistically equivalent propositions can 
be substituted for one another in conditional antecedents. The fourth ensures 
that if the probability of $\phi$ on $\psi$ is finite, then any condition that is extremely 
probable on condition $\psi$ is also extremely probable on condition $\phi \& \psi$. Finally, 
the fifth condition guarantees that impossible worlds have zero probability. 
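
The first four conditions can be checked exhaustively on a small model. The sketch below is illustrative only. It assumes that the selection function picks out the lowest-ranked members of a set, with rank standing in for order of improbability, holds the situation s fixed, and leaves out the fifth condition because no accessibility relation is modeled; the four worlds and their ranks are invented for the example.

```python
from itertools import combinations

WORLDS = ["w0", "w1", "w2", "w3"]
RANK = {"w0": 0, "w1": 1, "w2": 1, "w3": 2}

def f(A):
    """The most normal members of A: those of minimal rank."""
    if not A:
        return frozenset()
    low = min(RANK[w] for w in A)
    return frozenset(w for w in A if RANK[w] == low)

def subsets(worlds):
    return [frozenset(c)
            for r in range(len(worlds) + 1)
            for c in combinations(worlds, r)]

def check_conditions():
    """Test conditions 1 through 4 on every pair of sets of worlds."""
    sets = subsets(WORLDS)
    c1 = all(f(A) <= A for A in sets)
    c2 = all(f(A | B) <= f(A) | f(B) for A in sets for B in sets)
    c3 = all(f(A) == f(B) for A in sets for B in sets
             if f(A) <= B and f(B) <= A)
    c4 = all(f(A & B) <= f(A) for A in sets for B in sets if f(A) & B)
    return c1, c2, c3, c4
```

A ranking of this kind satisfies all four conditions, which is one way of seeing why the conditions are natural for a conditional read in terms of extreme probabilities.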

In moving from classical to partial conditional logic, two changes are needed. 
First, we must replace the set of worlds with a set of situations, to which we 
assign three- or four-valued interpretations of the atomic formulas. Second, 
we must replace the single selection function of the classical model with two 
selection functions, $f^{\uparrow}$ and $f^{\downarrow}$. When we apply $f^{\uparrow}$ to a situation $s$ and a set 
of situations $A$, we get as output the set of situations that are not definitely 
excluded from the set of the most normal or probable $A$-situations. When we 
apply $f^{\downarrow}$ to a situation $s$ and a set of situations $A$, we get as output the set of 
situations that are definitely included in the set of the most normal or probable 
$A$-situations. If $s \sqsubseteq s'$, then, for every set $A$, 


$$f^{\downarrow}(s, A) \subseteq f^{\downarrow}(s', A)$$
$$f^{\uparrow}(s', A) \subseteq f^{\uparrow}(s, A)$$


These conditions guarantee that the truth values of the conditionals are 
persistent. In partial logic, we can define the set $\|\varphi\|^{\uparrow}$ to be the set of situations
in the model that do not falsify $\varphi$. The truth and falsity conditions of the
conditionals in partial logic are the following: 


M, 8 — (¢0— v) & Va € f"(s, ||d||") M, 2 KK we 


M,s F ~(¢0> p) © ar € fi(s, |Id||") M,2 & 


There are some standard conditions on the set-selection function that we 
must impose. These conditions, like their analogues in the classical case, are 
supported by our interpretation of the selection function in terms of qualita- 
tive probability. However, just as we could not characterize reflexivity in par- 
tial modal logic, we cannot characterize three of the conditions that constitute 
Adams’s logic. Thus, I will impose analogues only of conditions 1 and 2. In ad- 
dition, we need a third condition, P3, that represents a special case of condition 
4. P3 is needed to validate the contrapositives of the rules validated by P1. 


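• (P1) $f^{\uparrow}(s, A \cup B) \subseteq f^{\uparrow}(s, A) \cup f^{\uparrow}(s, B)$, and $f^{\downarrow}(s, A \cup B) \subseteq f^{\downarrow}(s, A) \cup f^{\downarrow}(s, B)$.

• (P2) $f^{\uparrow}(s, A) \subseteq R^{\uparrow}[\{s\}]$, and $f^{\downarrow}(s, A) \subseteq R^{\downarrow}[\{s\}]$.

• (P3) $f^{\uparrow}(s, A) \subseteq f^{\uparrow}(s, A \cup B)$ or $f^{\uparrow}(s, B) \subseteq f^{\uparrow}(s, A \cup B)$, and $f^{\downarrow}(s, A) \subseteq f^{\downarrow}(s, A \cup B)$ or $f^{\downarrow}(s, B) \subseteq f^{\downarrow}(s, A \cup B)$.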


The canonical model for partial conditional logic is similar to that for partial
modal logic. In the case of four-valued logic, the class of situations in partial 
conditional logic is the set of saturated theories of the logic. The two set- 
selection functions in the canonical model can be defined as follows: 


e f1(s,\|@l|") = {TP € Sit: {Ws '(d0> py € s} NT =9} 
© f(s, ||d||") = {IP € Sit: {4p : “(dO 7p)’ € 5} CT} 


For four-valued interpretations, the conditional logic can be axiomatized by 
the following rules, the system rC: 


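• (C1) $(\varphi \mathrel{\Box\!\!\to} \chi), (\psi \mathrel{\Box\!\!\to} \chi) \vdash ((\varphi \vee \psi) \mathrel{\Box\!\!\to} \chi)$
• (C2) $\neg((\varphi \vee \psi) \mathrel{\Box\!\!\to} \chi) \vdash \neg(\varphi \mathrel{\Box\!\!\to} \chi) \vee \neg(\psi \mathrel{\Box\!\!\to} \chi)$
• (C3) $\neg(\varphi \mathrel{\Box\!\!\to} \chi), \neg(\psi \mathrel{\Box\!\!\to} \chi) \vdash \neg((\varphi \vee \psi) \mathrel{\Box\!\!\to} \chi)$
• (C4) $((\varphi \vee \psi) \mathrel{\Box\!\!\to} \chi) \vdash (\varphi \mathrel{\Box\!\!\to} \chi) \vee (\psi \mathrel{\Box\!\!\to} \chi)$
• (C5) $\neg(\varphi \mathrel{\Box\!\!\to} \psi) \vdash \Diamond\neg\psi$
• (C6) $\Box\varphi \vdash (\psi \mathrel{\Box\!\!\to} \varphi)$
• (C7) $(\varphi \mathrel{\Box\!\!\to} \psi), (\varphi \mathrel{\Box\!\!\to} \chi) \vdash (\varphi \mathrel{\Box\!\!\to} (\psi \& \chi))$
• (C8) $\neg(\varphi \mathrel{\Box\!\!\to} (\psi \& \chi)) \vdash \neg(\varphi \mathrel{\Box\!\!\to} \psi) \vee \neg(\varphi \mathrel{\Box\!\!\to} \chi)$
• (C9) If $\psi \vdash \chi$, then $(\varphi \mathrel{\Box\!\!\to} \psi) \vdash (\varphi \mathrel{\Box\!\!\to} \chi)$
• (C10) If $\psi \vdash \chi$, then $\neg(\varphi \mathrel{\Box\!\!\to} \chi) \vdash \neg(\varphi \mathrel{\Box\!\!\to} \psi)$
• (C11) $\Diamond\varphi \vdash \neg(\varphi \mathrel{\Box\!\!\to} \neg\varphi)$
• (C12) $(\varphi \mathrel{\Box\!\!\to} \neg\varphi) \vdash \Box\neg\varphi$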


Rules (C1) and (C3) correspond to condition P1, and the contrapositives 
(C2) and (C4) correspond to condition P3. Rules (C11) and (C12) correspond 
to condition P2. The other rules hold in any model, without special conditions 
on the selection function. 

A partial version of Lindenbaum’s lemma can once again be proved, and by 
means of that lemma, we can prove the usual canonical model theorem, and, 
hence, the completeness of the calculus rC. 

Another way to establish the axiomatizability of the logic, as well as such 
properties as decidability, is to extend the translations + and — from partial 
conditional logic to classical conditional logic. All we need to do is add two 
clauses to the definition I gave in the last section: 


e (go pyr _ (-¢- O17 aw) 
e (¢@u-— w)~ = a(ag7 oOo! a7) 


Once again, a set of formulas $\Gamma$ entails $\Delta$ in (four-valued) partial conditional
logic if and only if the translations of $\Gamma$ entail the translations of $\Delta$ in classical
conditional logic.


A.3.1 Classical Conditional Logic


The partial conditional logic of this section is a minimal, bare-bones logic. In 
particular, there are three model conditions of Adams’s conditional logic that I 
have not characterized proof-theoretically. 


1. $f(s, A) \subseteq A$.
2. If $f(s, A) \subseteq B$ and $f(s, B) \subseteq A$, then $f(s, A) = f(s, B)$.
3. If $f(s, A) \cap B \neq \emptyset$, then $f(s, A \cap B) \subseteq f(s, A)$.


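These three conditions correspond (respectively) to the following axioms of
the logic $\mathbf{VW}^{-}$:


• (Id) $(\varphi \mathrel{\Box\!\!\to} \varphi)$
• (CSO) $[(\varphi \mathrel{\Box\!\!\to} \psi) \& (\psi \mathrel{\Box\!\!\to} \varphi)] \to [(\varphi \mathrel{\Box\!\!\to} \chi) \leftrightarrow (\psi \mathrel{\Box\!\!\to} \chi)]$
• (CV) $[(\varphi \mathrel{\Box\!\!\to} \chi) \& \neg(\varphi \mathrel{\Box\!\!\to} \neg\psi)] \to ((\varphi \& \psi) \mathrel{\Box\!\!\to} \chi)$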


The situation here seems to be similar to that of reflexive modal logics and 
the axiom T. We should not expect axioms like (Id), (CV), or (CSO) to be 
verified by every situation token, although we can expect them to be verified 
by every possible world, including the actual one. As situations gain in modal 
information, they will verify more instances of classical axioms like these. 


A.4 Partiality and Quantificational Logic 


Partial modal and conditional logics, as important as they are, are not in them- 
selves sufficient for formulating an adequate model of causation. In addition to 
the representation of necessity and objective probability, we must also be able 
to talk about the situations that are causing and being caused, and we must 
be able to represent some situations as parts of others. Consequently, in this 
section I will develop a partial quantificational logic. 

This quantificational logic will be different from standard quantified modal 
logics in that the individuals being named and quantified over will be indices 
(situations and worlds) and not ordinary substances (like people and organisms). 
One might think that quantification over situations makes the modal operators 
redundant, since we could define necessity by simply using a universal quan- 
tifier. However, replacing modality with explicit quantification over situations 
would eliminate a critical element implicit in the use of modal operators: the 
indexicality of modal properties. We could re-introduce this element of index- 
icality by adding a special indexical constant to our language, something that 
intuitively picks out “this situation.” The necessity of $\varphi$ could then be defined
as $\varphi$’s holding in every situation accessible from this situation. However, this fix
would introduce another problem: many formulas involving the constant this
situation would be non-persistent. We could have the formula “$\varphi$ does not hold
in this situation” holding in s but not holding in a strictly larger situation s′
that contains s.

Thus, I will use a language that contains both modal operators (with their 
implicit indexicality) and terms and variables that stand for situations (and do 
so non-indexically). In addition to situation constants, variables, and quantifiers,
I will add two new kinds of atomic formulas (t and t′ are situation constants
and $\varphi$ is any formula):


• $(t \models \varphi)$
• $At$


We can define the part-whole relation $\sqsubseteq$ by means of these elements:


$(t \sqsubseteq t') =_{def} (t' \models At)$


In turn, the identity predicate can be defined in terms of $\sqsubseteq$: $(t = t') =_{def} (t \sqsubseteq t' \;\&\; t' \sqsubseteq t)$.

The first new kind of atomic formula is an object language counterpart to 
the verification relation between situation tokens and types. For simplicity’s 
sake, I will assume that the logic of these two kinds of formulas is entirely 
classical (bivalent). I will assume that if one situation is part of another, or if 
one situation verifies a formula, then these facts are supported in every situation 
in the model. For some purposes it might be useful to make situations partial in 
their mereological or classificatory information (for example, this might be very 
important in modeling certain propositional attitudes), but I have not found this 
additional flexibility necessary in dealing with the concept of causation. Thus, 
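the truth and falsity conditions for these formulas are the following (where $\|t\|$
represents the designatum of constant t in the model M):


• $M, s \models (t \sqsubseteq t') \Leftrightarrow \|t\| \sqsubseteq \|t'\|$
• $M, s \models \neg(t \sqsubseteq t') \Leftrightarrow \neg(\|t\| \sqsubseteq \|t'\|)$
• $M, s \models (t \models \varphi) \Leftrightarrow M, \|t\| \models \varphi$
• $M, s \models \neg(t \models \varphi) \Leftrightarrow M, \|t\| \not\models \varphi$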


The atomic formula At means that the situation t is definitely part of the 
actual world (from the perspective of the indexed situation). Such a formula 
is verified by a situation s just in case the situation ||t|| is a part (proper or 
improper) of s. A model structure for the language must include a binary
relation $A^{-}$ to provide falsity conditions for the A predicate.


• $M, s \models At \Leftrightarrow \|t\| \sqsubseteq s$
• $M, s \models \neg At \Leftrightarrow (s, \|t\|) \in A^{-}$


To ensure that the logic of situations is axiomatizable, it is essential that
we place certain conditions on the relation $A^{-}$. First, there is a fixed-point
condition: whenever two situations verify contradictory formulas of any kind
(including formulas involving A itself), the ordered pair of the two situations
must belong to $A^{-}$. Second, if a situation s does not support $\neg\varphi$, then there
must exist a token s′ such that s′ supports $\varphi$ and $(s, s')$ does not belong to the
relation $A^{-}$. Third, if there is no situation accessible to situation s (either by the
inner relation $R^{\downarrow}$ or by the outer relation $R^{\uparrow}$) that extends both s and s′, then
the pair $(s, s')$ belongs to $A^{-}$. Finally, the relation $A^{-}$ must be mereologically
persistent.


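1. (Fixed point condition) If $M, s \models \varphi$ and $M, s' \models \neg\varphi$, then $(s, s') \in A^{-}$.
2. If $M, s \not\models \neg\varphi$, then there exists an s′ such that $M, s' \models \varphi$ and $(s, s') \notin A^{-}$.
3. (Modal condition) If $\neg\exists x\,(x \in R^{\downarrow}[\{s\}] \cup R^{\uparrow}[\{s\}] \;\&\; s \sqsubseteq x \;\&\; s' \sqsubseteq x)$, then $(s, s') \in A^{-}$.
4. If $(x, y) \in A^{-}$, and $x \sqsubseteq z$ and $y \sqsubseteq w$, then $(z, w) \in A^{-}$.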


In the case of the quantifiers, the truth and falsity conditions reflect the 
usual extension of the Dunn truth tables. 


• $M, s \models \forall x \varphi \Leftrightarrow$ for every situation s′ in Sit, $M_{s'}, s \models \varphi[t/x]$, where $M_{s'}$
is a model that differs from M only in assigning s′ to a new constant t
(a constant not occurring in $\varphi$).


• $M, s \models \neg\forall x \varphi \Leftrightarrow$ for some situation s′ in Sit, $M_{s'}, s \models \neg\varphi[t/x]$, where
$M_{s'}$ is a model that differs from M only in assigning s′ to a new constant
t (a constant not occurring in $\varphi$).


The logic that corresponds to this semantical system will be called $\mathcal{L}_{Sit}$. In
addition to the rules of the system rL for partial propositional logic, M/~~ for 
partial modal logic, and rC for partial conditional logic, we need to add, first, 
the following quantifier and identity rules: 


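• (Q1) $\forall x \varphi \vdash \varphi[t/x]$
• (Q2) $\varphi[t/x] \vdash \exists x \varphi$
• (Q3) If $\Gamma \vdash \varphi[t/x]$ and t does not occur in $\Gamma$ or $\varphi$, then $\Gamma \vdash \forall x \varphi$.
• (Q4) If $\Gamma, \varphi[t/x] \vdash \Delta$, and t does not occur in $\Gamma$, $\varphi$ or $\Delta$, then $\Gamma, \exists x \varphi \vdash \Delta$.
• (Q5) $\vdash \forall x \forall y\,(x = y \leftrightarrow ((x \models Ay) \& (y \models Ax)))$
• (Q6) $\vdash \forall x\,(x \models Ax)$
• (Q7) $\varphi \vdash \forall x\,(x = t \to \varphi[x//t])$, where $\varphi[x//t]$ is the result of replacing one
or more occurrences of t in $\varphi$ with x.
• (Q8) $\neg\forall x \varphi \dashv\vdash \exists x \neg\varphi$
• (Q9) $\neg\exists x \varphi \dashv\vdash \forall x \neg\varphi$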


In order to capture the classicality of atomic formulas involving |=, we must 
add the following two rules: 


• (Q8) If $\Gamma$ classically entails $\Delta$, and the atomic formulas in $\Gamma$ and $\Delta$ involve
only $\models$, then $\Gamma \vdash \Delta$.


• (Q9) $(t \models \varphi), \neg(t \models \varphi) \vdash \psi$


Rule (Q10) connects impossibility with the support of non-actuality. Since
all the formulas of our language are persistent (with respect to the part-whole
ordering), we must add rules that ensure that if a situation t is part of the
current index, and t supports formula $\varphi$, then the current index also supports
$\varphi$. Also, if the current index supports $\varphi$, and t supports $\neg\varphi$, then the current
index supports $\neg At$. Thirdly, every situation verifies the formula that it is
actual. Rules (Q11) and (Q12) guarantee that actuality has the right kind of
fixed-point character.


e (Q10) At, O-(At & At’) t “At! V O(At & AAt’) 
• (Q11) $\varphi \dashv\vdash \exists x\,(Ax \;\&\; (x \models \varphi))$
• (Q12) $\neg\varphi \dashv\vdash \forall x\,((x \models \varphi) \to \neg Ax)$


The rules governing the support relation |= ensure that the formulas sup- 
ported by a situation form a saturated theory. 


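• (Q13) $\vdash \forall x\,((x \models \forall y \varphi) \to \forall y\,(x \models \varphi))$
• (Q14) $\vdash \forall x\,((x \models (\varphi \& \psi)) \leftrightarrow ((x \models \varphi) \& (x \models \psi)))$
• (Q15) $\vdash \forall x\,((x \models \exists y \varphi) \to \exists y\,(x \models \varphi))$
• (Q16) $\vdash \forall x\,((x \models (\varphi \vee \psi)) \leftrightarrow ((x \models \varphi) \vee (x \models \psi)))$
• (Q17) If $\varphi \vdash \psi$, then $(t \models \varphi) \vdash (t \models \psi)$.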


I will assume that all mereological and classificatory facts are supported by 
all tokens, and that they hold of necessity: 


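• (Q18) $\vdash \forall x \forall y\,((x \models (y \models \varphi)) \leftrightarrow (y \models \varphi))$
• (Q19) $\vdash \forall x \forall y\,((x \models \neg(y \models \varphi)) \leftrightarrow \neg(y \models \varphi))$
• (Q20) $\Diamond\varphi \dashv\vdash \varphi \dashv\vdash \Box\varphi$, where $\varphi$ contains only $\models$ atoms.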


Finally, I will stipulate that the domains of quantification associated with all
situations are the same. In each case, we will be quantifying over all situations,
actual and non-actual, possible and impossible. We can of course express the
actuality of a situation by the formula $At$ and its possibility by $\Diamond At$. Since the
domains of quantification are constant, we can add two rules corresponding to
the Barcan and converse Barcan formulas of standard quantified modal logic.


• (Q21) $\forall x \Box\varphi \dashv\vdash \Box\forall x \varphi$
• (Q22) $\exists x \Diamond\varphi \dashv\vdash \Diamond\exists x \varphi$


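In a canonical model for the logic $\mathcal{L}_{Sit}$, the set of situations consists of a
certain set of supersaturated theories. A theory $\Gamma$ is supersaturated if and only
if it meets the three conditions:


1. If $\forall x \varphi \notin \Gamma$, then for some t, $\varphi[t/x] \notin \Gamma$.
2. If $\exists x \varphi \in \Gamma$, then for some t, $\varphi[t/x] \in \Gamma$.
3. If $(\varphi \vee \psi) \in \Gamma$, then either $\varphi \in \Gamma$ or $\psi \in \Gamma$.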


To prove the completeness of the logical rules, we must show that if $\Gamma \nvdash \Delta$,
then $\Gamma \not\models \Delta$. To begin with, we will add a new constant, t*, to stand for the
current index, the situation token at which the types in $\Gamma$ are verified, and
the types in $\Delta$ are not verified. For each type $\varphi$ in $\Gamma$, we shall add the type
$(t^{*} \models \varphi)$, producing a new set, $\Gamma^{+}$. We can easily prove a lemma to the effect
that if $\Gamma \nvdash \Delta$, then $\Gamma^{+} \nvdash \Delta$, using rules (Q4) and (Q14), given the fact that t*
does not occur in $\Gamma$.

We can then easily prove a generalization of the partial version of Linden- 
baum’s lemma: 


Lemma A.2 (Generalized Lindenbaum’s Lemma) If $\Gamma^{+} \nvdash \Delta$, then $\Gamma^{+}$
can be extended to a supersaturated theory $\Gamma^{*}$ such that $\Gamma^{*} \cap \Delta = \emptyset$.


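This can be proved by the construction of a series of pairs $(\Gamma_i, \Delta_i)$, such
that $\Gamma_i \nvdash \Delta_i$, $\Gamma^{+} \subseteq \Gamma_i$, $\Delta \subseteq \Delta_i$, and $\Gamma_i \cap \Delta_i = \emptyset$. The construction is based,
as usual, on an enumeration of the language in which every formula comes up
infinitely often. When we reach formula $\varphi_i$, we apply the following rules:

1. If $\Gamma_i \vdash \varphi_i$, then $\varphi_i \in \Gamma_{i+1}$.

2. If $\varphi_i \vdash \Delta_i$, then $\varphi_i \in \Delta_{i+1}$.

3. Otherwise: if $\varphi_i = \psi[t'/v]$, $\exists v \psi \in \Gamma_i$, $\Gamma_i, \varphi_i \nvdash \Delta_i$, and there is no t such
that $\psi[t/v] \in \Gamma_i$, then $\varphi_i \in \Gamma_{i+1}$.

4. If $\varphi_i = \psi[t'/v]$, $\forall v \psi \in \Delta_i$, $\Gamma_i \nvdash \Delta_i, \varphi_i$, and there is no t such that $\psi[t/v] \in \Delta_i$,
then $\varphi_i \in \Delta_{i+1}$.

We can then let $\Gamma^{*} = \bigcup_i \Gamma_i$. It is easy to show that $\Gamma^{+} \subseteq \Gamma^{*}$, $\Gamma^{*} \cap \Delta = \emptyset$, and
that $\Gamma^{*}$ is a supersaturated theory.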

This supersaturated theory will contain a model base $B(\Gamma^{*})$, consisting of a
consistent and complete assignment of classical truth values to the mereological
and classificatory atomic formulas (those containing $\sqsubseteq$ and $\models$). In the canonical
model for the logic $\mathcal{L}_{Sit}$ based on $\Gamma^{*}$, the set of situations consists of all the
supersaturated theories that extend the model base $B(\Gamma^{*})$.

In this canonical model, we can associate each term t with a canonical 
situation-token (a supersaturated theory) in Sit,4, in the following way: 


$\|t\| = \{\varphi : \text{‘}(t \models \varphi)\text{’} \in B(\Gamma^{*})\}$


The supersaturation of $\Gamma^{*}$ and axioms (Q13) through (Q17) ensure that
each of these associated sets is a supersaturated theory. Rules (Q18) and (Q19)
guarantee that these theories also extend $B(\Gamma^{*})$ and so are all members of
Sitma.- 

The part-whole relation $\sqsubseteq$ in the canonical model is given by the subset
relation between theories in Sit,,,. The interpretation function J assigns truth 
values to simple atomic formulas by reference to the inclusion or non-inclusion 
of the formula and its negation in each theory. 


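1. $I(s)(\varphi) = T \Leftrightarrow \text{‘}\varphi\text{’} \in s \;\&\; \text{‘}\neg\varphi\text{’} \notin s$
2. $I(s)(\varphi) = F \Leftrightarrow \text{‘}\neg\varphi\text{’} \in s \;\&\; \text{‘}\varphi\text{’} \notin s$
3. $I(s)(\varphi) = U \Leftrightarrow \text{‘}\varphi\text{’} \notin s \;\&\; \text{‘}\neg\varphi\text{’} \notin s$
4. $I(s)(\varphi) = B \Leftrightarrow \text{‘}\varphi\text{’} \in s \;\&\; \text{‘}\neg\varphi\text{’} \in s$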
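
The four clauses amount to a simple lookup on a theory. The following sketch is only an illustration under stated assumptions: a situation-theory is represented as a Python set of formula strings, and negation is written with a leading `~`; both conventions are hypothetical and are not the notation of the text.

```python
def value(theory, atom):
    """Return 'T', 'F', 'U' or 'B' for an atomic formula in a situation-theory.

    `theory` is a set of formula strings; the negation of `atom` is the
    string '~' + atom (a convention assumed only for this sketch).
    """
    has_pos = atom in theory
    has_neg = ("~" + atom) in theory
    if has_pos and has_neg:
        return "B"  # both the formula and its negation are included
    if has_pos:
        return "T"  # only the formula is included
    if has_neg:
        return "F"  # only the negation is included
    return "U"      # neither is included
```

Because the value depends only on membership, enlarging a theory can move an atom from U to T or F, or from T or F to B, but never in the other direction, which is the persistence behaviour the part-whole ordering requires.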


It is then straightforward to prove the usual truth theorem for the canonical 
model: a formula (whether atomic or complex) is at least true at a situation 
theory just in case it is included in the situation-theory, and it is at least false 
just in case its negation is included in the situation theory. The situation corresponding
to the special constant t* has been constructed so as to include every
member of $\Gamma$ and exclude every member of $\Delta$. From the truth theorem for
the canonical model, it follows that $\Gamma \not\models \Delta$. This suffices (since $\Gamma$ and $\Delta$ were
arbitrary sets of formulas such that $\Gamma \nvdash \Delta$) to prove the completeness of the
inference rule set.

An alternative route to this completeness result is to make use again of the 
Gilmore-Feferman technique of translating partial logic into classical logic. In 
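the case of $\mathcal{L}_{Sit}$, the translations + and − must be extended as follows:


• $(t \sqsubseteq t')^{+} = (t \sqsubseteq t')$
• $(t \models \varphi)^{+} = (t \models \varphi)$
• $(At)^{+} = A^{+}t$
• $(\forall x \varphi)^{+} = \forall x \varphi^{+}$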
fee) P(e)


Muskens has proved that $\Gamma$ entails $\Delta$ in partial predicate logic if and only
if $\Gamma^{+}$ entails $\Delta^{+}$ in classical predicate logic (Muskens, 1995, p. 54). As corollaries
of this result we have partial versions of the compactness theorem and
the Löwenheim-Skolem theorem, and a second proof that partial predicate logic
can be recursively axiomatized. In addition, the translation enables us to transfer
other standard results in classical predicate logic.


A.5 First-Order Quantification over 
Situation-Types 


I take the formulas of our language to correspond to something in the world: 
what situation theorists like Barwise refer to as “situation-types.” These 
situation-types are closed under such logical operations as negation, conjunc- 
tion, disjunction, and generalization. In addition, there are specifically modal 
and conditional situation types, corresponding to formulas containing $\Box$ and
$\mathrel{\Box\!\!\to}$.

Situation-types are needed to account for certain kinds of causal connections, 
in particular the connection I will call causal explanation. Unlike many Platon- 
ists, but like some contemporary realists such as Armstrong and Hochberg, I 
do not insist that every meaningful predicate or open formula of our language 
must correspond to a situation-type. There are purely formal or logical proper- 
ties and relations that can be defined but may not correspond to a real type in 
the world. In this way, I can avoid Russellian paradoxes involving properties, 
like the supposed property of heterologicality (the property of being something 
that cannot be truthfully predicated of itself). 

Although I am a realist about types and think of them as universals (things 
that can be multiply instantiated in the world), this realism about universals 
is not essential to the program I am undertaking in this book. If one prefers 
to think of atomic situation types as natural classes, classes bound together by 
especially close relations of natural similarity, as does, for example, David Lewis 
(1983), I have no deep-seated objection. I am inclined to think that relations of
natural similarity must be grounded in the co-instantiation of some universal, 
rather than the other way around, but I am not confident of having a definitive 
argument for settling this ancient dispute. 

There will be no need in this volume for any quantification over types, and 
relatively little need for it in the next volume. However, in chapter 15, in devel- 
oping a Platonistic theory of the natural numbers, I did make use of quantifica- 
tion over types. I do not, however, require the full force of standard second-order 
quantification, in which the second-order variables are taken as ranging over
arbitrary classes of the first-order domain. Instead, I can make use of what is 
essentially a first-order theory of situation-types. Since I distinguish between 
real and merely logical types, my logic will not include an axiom of abstraction, 
positing the existence of a type corresponding to every open formula of the 
language. Lambda abstraction will not be assumed to lead unfailingly to the 
specification of a real situation-type. Thus, I need not accept heterologicality,
defined as $\lambda x\, \neg(x \models x)$.

We can take situation-types to be a special class of entities, closed under 
certain logical operations. We do not have to think of the logical and modal 
operators as literally parts of “complex” types. Instead, we can take negation, 
for example, to be a particular relation between types. Types that are logically 
equivalent (in partial logic), such as $\varphi$ and $\neg\neg\varphi$ or $\varphi \& \psi$ and $\psi \& \varphi$, may, if
we wish, be identified with one another. We also do not have to reify the 
variables of predicate logic as parts of generalized types. Instead, we can take 
substitution to be a ternary relation (definable recursively) between two types 
and an individual token, and we can take existential generality to be another 
ternary relation between two types and an individual, definable in terms of 
substitution. 

Quantification over types can be enabled simply by taking the set of types 
to be a subset of the set of tokens. The same first-order variables can then be 
taken as ranging both over ordinary particulars and over abstract types. 


Appendix B 


A Causal Calculus 


B.1 Causation and Projectible Statistics 


The power of causation comes from its impact on statistical inference. We 
can now specify exactly what impact causation has on projectible statistics by 
formalizing the principles known as “Markov’s rules.” Markov’s principles tell 
us that when one fact a screens off one of its effects b from some other fact 
c, then b is statistically independent of c, given a. In standard treatments of
causal inference, Markov’s principles are formulated for the special case in which 
the causes and effects are random variables. We need to formalize Markov’s 
principles for the general case, in which causes and effects are represented by 
types of arbitrary logical complexity. 

This task of formalizing Markov’s principles depends on expressing a relation 
between types of situations, and not merely a relation between situation-tokens. 
I defined such a relation between situation-types in chapters 4 and 5. In this 
appendix, I will illustrate the advantages of such an account. 

For simplicity’s sake, I will assume that all of the situation-tokens in the 
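relevant class of models are modally complete and coherent. Consequently, I
will make use of a fully classical, bivalent conditional logic, the logic $\mathbf{VW}^{-}$,
which includes the following axioms and rules:


• (RCEC) From $\vdash (\varphi \leftrightarrow \psi)$ to infer $\vdash (\chi \mathrel{\Box\!\!\to} \varphi) \leftrightarrow (\chi \mathrel{\Box\!\!\to} \psi)$.
• (RKC) From $\vdash (\varphi_1 \& \ldots \& \varphi_n \to \psi)$ to infer $\vdash [(\chi \mathrel{\Box\!\!\to} \varphi_1) \& \ldots \& (\chi \mathrel{\Box\!\!\to} \varphi_n)] \to (\chi \mathrel{\Box\!\!\to} \psi)$.
• (Id) $\vdash (\varphi \mathrel{\Box\!\!\to} \varphi)$.
• (Mod) $\vdash \Box\varphi \to (\psi \mathrel{\Box\!\!\to} \varphi)$.
• (CSO) $\vdash [(\varphi \mathrel{\Box\!\!\to} \psi) \& (\psi \mathrel{\Box\!\!\to} \varphi)] \to [(\varphi \mathrel{\Box\!\!\to} \chi) \leftrightarrow (\psi \mathrel{\Box\!\!\to} \chi)]$.
• (CV) $\vdash [(\varphi \mathrel{\Box\!\!\to} \psi) \& \neg(\varphi \mathrel{\Box\!\!\to} \neg\chi)] \to [(\varphi \& \chi) \mathrel{\Box\!\!\to} \psi]$.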


The logic $\mathbf{VW}^{-}$ corresponds to the interpretation of the conditional in terms
of extreme probabilities proposed by Ernest Adams (1975) and Judea Pearl
(1988) (as I discussed in A.3). (See also Lehmann and Magidor (1992).) This
logic corresponds to David Lewis’s logic VW, minus the MP axiom (that is,
minus $(\varphi \& (\varphi \mathrel{\Box\!\!\to} \psi)) \to \psi$).

Once we have formalized such generalized Markov principles, these principles 
can enable us to specify a nonmonotonic logic that is adequate to the task of 
reasoning about dynamic situations. Consider, for example, the infamous Yale
Shooting Problem, due to Hanks and McDermott (1987). I will formalize the
problem, using the modal/statistical conditional $\mathrel{\Box\!\!\to}$ in the statement of the
defeasible rules. We start with a set of facts, {A, L, S}, standing for the fact
that the victim is initially alive, the gun is initially loaded, and that an event of
shooting takes place (after a short period of waiting). We have three defeasible
rules:


• $(A \mathrel{\Box\!\!\to} A')$
• $(L \mathrel{\Box\!\!\to} L')$
• $((A \& L' \& S) \mathrel{\Box\!\!\to} \neg A')$


The first two of these rules are instances of the so-called law of inertia. The 
third rule is a causal law specifying that the firing of a loaded gun overrides 
the inertia of being alive, resulting in the victim’s being dead in the succeeding 
state. In standard nonmonotonic logics, this set of facts and rules has two 
permissible extensions. In one of these extensions, the gun stays loaded and the 
victim is killed. In the other, the gun mysteriously becomes unloaded before 
the shooting, and the victim remains alive. Each of these scenarios involves the 
occurrence of something unexpected, either the overriding of the inertia of being 
alive, or the overriding of the inertia of being loaded. 

By taking into account the causal structure of the situation, we can apply
Markovian rules to derive a principled solution to the problem. Consider the
following causal structure diagram.

Figure B.1: The Yale Shooting Problem

Notice that L screens off A and $\neg A'$ from L′. This means that, by means
of Markov’s principle, we can strengthen the antecedent of the law of inertia as
applied to L, resulting in the new rule:


$((L \& A \& \neg A') \mathrel{\Box\!\!\to} L')$


This rule now takes priority in many nonmonotonic logics (such as Pearl’s
System Z or Asher/Morreau’s Commonsense Entailment) over the law of inertia
as applied to A. This means that we can throw out the second scenario, and we
can successfully infer that the victim is not alive.

Suppose that we had instead the following causal structure, where Z represents
the presence of zealous police protection for the would-be victim.

The presence of these zealous bodyguards is causally prior both to A (since
it is a possible explanation of the victim’s surviving up to the present time) and,
let us assume, to L′ (since the bodyguards have access to the loaded gun during
the quiescent interval). In this case, L no longer screens A and $\neg A'$ off from L′,
and we can no longer apply Markov’s principles. In this case, we cannot infer
that the victim is dead, which seems to be the intuitively correct result.

Of course, we were not told in the original story that there was no such po- 
lice bodyguard. Apparently, nonmonotonic reasoning about dynamic situations 
involves two processes: first, assuming the minimality of the causal structure 
of the current situation, and second, using that causal structure (in combina- 
tion with Markov’s principles and various defeasible rules) to infer the probable 
consequences. 
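
The two competing scenarios, and the effect of giving one inertia rule priority, can be made concrete with a very small enumeration. This is a deliberately crude sketch and not an implementation of System Z or Commonsense Entailment: it assumes that the causal law is exceptionless, it represents L′ and A′ by the booleans `l2` and `a2`, and it stipulates the priority ordering by hand to stand in for the effect of the strengthened inertia rule. All function names are hypothetical.

```python
from itertools import product

# Facts: alive (A), loaded (L) and shooting (S) all hold initially.
# Unknowns: l2 (gun still loaded at the shooting, L') and a2 (victim alive
# afterwards, A').


def violated_defaults(l2, a2):
    """Names of the inertia defaults that a scenario overrides."""
    broken = set()
    if not a2:
        broken.add("inertia-alive")   # the default from A to A' is overridden
    if not l2:
        broken.add("inertia-loaded")  # the default from L to L' is overridden
    return broken


def respects_causal_law(l2, a2):
    # Causal law: alive, loaded and shooting yield not-alive afterwards.
    # Treated here as exceptionless (an assumption of this sketch).
    return not (l2 and a2)


def scenarios():
    return [(l2, a2) for l2, a2 in product([True, False], repeat=2)
            if respects_causal_law(l2, a2)]


def minimal_scenarios():
    """Scenarios whose set of overridden defaults is subset-minimal."""
    cands = scenarios()
    return [s for s in cands
            if not any(violated_defaults(*t) < violated_defaults(*s)
                       for t in cands)]


def prioritized_scenarios(priority=("inertia-loaded", "inertia-alive")):
    """Respect higher-priority defaults first, where that is possible."""
    cands = scenarios()
    for name in priority:
        kept = [s for s in cands if name not in violated_defaults(*s)]
        if kept:
            cands = kept
    return cands
```

Without priorities, `minimal_scenarios()` returns both the "loaded, dead" and the "unloaded, alive" scenarios; with the inertia of loading ranked first, `prioritized_scenarios()` keeps only the scenario in which the gun stays loaded and the victim dies.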


B.2 Some Other Well-Known Puzzles 


Judea Pearl (1988) discusses a very simple problem illustrating the necessity 
of introducing causal information into the formalization of common-sense reasoning.
Consider a sprinkler and a sidewalk. We have two defeasible rules:
(Sprinkler-on $\mathrel{\Box\!\!\to}$ Wet) and (Wet $\mathrel{\Box\!\!\to}$ Rain). Suppose we know that the sprinkler
is on. We can reasonably conclude that the sidewalk is wet. However, if
we go on to infer that it probably rained in the recent past, something has gone 
wrong. We need to make some sort of use of the fact that both the sprinkler’s 
being on and the rain are causally prior to the wetness of the sidewalk. 
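
The faulty inference is easy to reproduce with naive rule chaining. The sketch below is an illustration only: it tags each defeasible rule as "causal" (premise causally prior to conclusion) or "evidential", and blocks an evidential rule when its premise was itself derived by a causal rule. This blocking policy is an assumption of the sketch, loosely in the spirit of distinguishing causal from evidential rules; it is not the account developed in this appendix.

```python
# Each defeasible rule is (premise, conclusion, kind).
RULES = [
    ("sprinkler_on", "wet", "causal"),   # cause to effect
    ("wet", "rain", "evidential"),       # effect to possible cause
]


def naive_closure(facts):
    """Forward chaining that ignores causal direction."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion, _ in RULES:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def causal_closure(facts):
    """Forward chaining that blocks evidential rules on causally derived premises."""
    status = {f: "observed" for f in facts}  # how each proposition was established
    changed = True
    while changed:
        changed = False
        for premise, conclusion, kind in RULES:
            if premise not in status or conclusion in status:
                continue
            if kind == "evidential" and status[premise] == "causal":
                continue  # a predicted effect is not evidence for a rival cause
            status[conclusion] = kind
            changed = True
    return set(status)
```

Starting from the sprinkler alone, naive chaining concludes that it rained, while the direction-sensitive version stops at the wet sidewalk; if the wetness is observed directly, both versions still infer rain.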

Figure B.2: The Yale Shooting Problem, With Bodyguards

Another well-known problem is that of the lamp, discussed by Vladimir
Lifschitz (1990). Suppose we have a lamp connected to a pair of switches. If
both switches are up, or both are down, then the lamp is on; otherwise, it is
off. Suppose that the first switch is down and the second is up. The lamp is off.
Now, suppose we flip the first switch up. We want to derive the consequence 
that the lamp comes on. However, a result that is equally in accord with the 
defeasible rules is one in which the second switch moves down. Again, we need 
to take into account the fact that the state of the lamp, but not that of the 
second switch, is causally posterior to the position of the first switch. 
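
The ambiguity can be exhibited by brute force. In the sketch below, which is only an illustration, a state assigns booleans to two switches and the lamp (True meaning up or on), the constraint says that the lamp is on exactly when the switches agree, and `CAUSAL_DESCENDANTS` is a hypothetical table recording that only the lamp is causally posterior to either switch.

```python
from itertools import product

FLUENTS = ("switch1", "switch2", "lamp")
# Assumed causal ordering: the lamp is causally posterior to both switches.
CAUSAL_DESCENDANTS = {"switch1": {"lamp"}, "switch2": {"lamp"}, "lamp": set()}


def consistent(state):
    # The lamp is on exactly when the two switches are in the same position.
    return state["lamp"] == (state["switch1"] == state["switch2"])


def successors(state, fluent, value):
    """All consistent states in which `fluent` has been set to `value`."""
    out = []
    for vals in product([True, False], repeat=len(FLUENTS)):
        cand = dict(zip(FLUENTS, vals))
        if cand[fluent] == value and consistent(cand):
            out.append(cand)
    return out


def changed(old, new, ignore):
    """Fluents other than `ignore` whose value differs between the states."""
    return {f for f in FLUENTS if f != ignore and old[f] != new[f]}


def minimal_change(state, fluent, value):
    """Successors that alter as few other fluents as possible."""
    cands = successors(state, fluent, value)
    fewest = min(len(changed(state, c, fluent)) for c in cands)
    return [c for c in cands if len(changed(state, c, fluent)) == fewest]


def causal_change(state, fluent, value):
    """Minimal-change successors that alter only causal descendants of `fluent`."""
    allowed = CAUSAL_DESCENDANTS[fluent]
    return [c for c in minimal_change(state, fluent, value)
            if changed(state, c, fluent) <= allowed]
```

From the state with the first switch down, the second up and the lamp off, flipping the first switch has two minimal-change successors (lamp comes on, or second switch drops); restricting side effects to causal descendants of the first switch leaves only the one in which the lamp comes on.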

Finally, there is the problem of the emperor’s colored blocks, proposed by 
Lin and Reiter (1994). There are two red blocks. The emperor has decreed 
that either both blocks or neither shall be yellow at any time. Suppose we try 
to perform the action of painting just one of the blocks yellow. The correct 
conclusion to draw is that the action will fail. We must somehow avoid the 
conclusion that painting the first block will, in and of itself, cause the second 
block to become yellow as well. 


B.3 Screening Off 


I will now define the relation, represented by $\sigma(s_1, s_2, s_3)$, according to which
one token is screened off from a second by a third. I will make use of two new
abbreviations: $P_w$ and $N_w$. $P_w(s)$ is the sum of the tokens that are parts of
world w and immediately prior to token s, and $N_w(s)$ is the sum of the tokens
that are parts of world w and immediately posterior to token s.

Throughout these definitions, I will assume the thesis I have argued for in 
chapters 5 and 8, namely, the thesis that the existence of a situation-token
necessitates the existence of every token causally prior to it. If we drop this 
thesis, we must take into account the possibility of counterfactual, non-actual 
causes of actual token events. The causal antecedents of a given token would 
then include non-actual, as well as actual, prior tokens. In order to screen off the 
probability of the effect, we would have to have complete information about the 
occurrence and non-occurrence of each of its causal antecedents. It seems that 
this is wrong: it is sufficient to take into account the occurrence of the actual 
causes of an actual event. If so, we must postulate that actual events have no 
non-actual causal antecedents, and this postulation, in turn, would make sense 
only if each event necessitates all of its causal antecedents. 


Definition B.1 (Backward Causal Chain) A sequence of tokens c is a backward
causal cone relative to w, starting from s, $B_w(c, s)$, if and only if the following
conditions are met:


1. $c(0) = s$
2. $\forall i < \mathrm{dom}(c):\; c(i+1) = P_w(c(i))$


Definition B.2 (Forward Causal Chain) A sequence c is a forward causal
cone relative to w starting from s, $F_w(c, s)$, if and only if the following conditions
are met:


1. $c(0) = s$
2. $\forall i < \mathrm{dom}(c):\; c(i+1) = N_w(c(i))$


Definition B.3 (Common Cause) Token $s_1$ is a common cause of $s_2$ and $s_3$,
$CC(s_1, s_2, s_3)$, if and only if: there exist sequences c and c′ and world w such
that the conditions $B_w(c, s_2)$ and $B_w(c', s_3)$ hold, together with the conditions
$s_1 \sqsubseteq c(i)$ and $s_1 \sqsubseteq c'(j)$, for some i and j.


Definition B.4 (Trajectory) Situation $s_1$ stands athwart the trajectory from
$s_2$ to $s_3$, $Tr(s_1, s_2, s_3)$, if and only if: there exist sequences c and c′ and world
w such that the conditions $F_w(c, s_2)$ and $B_w(c', s_3)$ hold, together with the conditions
$s_1 \sqsubseteq c(i)$ and $s_1 \sqsubseteq c'(j)$, for some i and j.


Definition B.5 (Causal Screening Off) Token $s_1$ is screened off from $s_2$ by
$s_3$, $\sigma(s_1, s_2, s_3)$, if and only if $s_3 \prec s_2$ and: $\forall x\,(CC(x, s_1, s_2) \to Tr(s_3, x, s_2))$.


In other words, $s_1$ is causally screened off from $s_2$ by $s_3$ just in case $s_3$ is
prior to $s_2$ and $s_3$ stands athwart the trajectory from any common cause of $s_1$
and $s_2$ to $s_2$ itself.
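
Definitions B.1 through B.5 can be run on a toy causal graph. The sketch below makes several simplifying assumptions that are not in the text: there is a single fixed world, tokens are atomic strings, a sum of tokens is a Python set, parthood in a sum is set membership, the graph is acyclic, and "prior to" means occurring at a later stage of the backward chain. The graph `EDGES` and all function names are hypothetical.

```python
# Hypothetical world: each token maps to its immediately posterior tokens.
EDGES = {"c": {"a", "m"}, "m": {"b"}, "a": set(), "b": set(), "z": {"b"}}


def posterior(tokens):
    """N_w: the sum (here, set) of tokens immediately posterior to `tokens`."""
    return frozenset(t for s in tokens for t in EDGES[s])


def prior(tokens):
    """P_w: the sum of tokens immediately prior to some part of `tokens`."""
    return frozenset(p for p, kids in EDGES.items() if kids & set(tokens))


def chain(start, step):
    """c(0) = start, c(i+1) = step(c(i)); stops when the next sum is empty."""
    seq = [frozenset([start])]
    while True:
        nxt = step(seq[-1])
        if not nxt:
            return seq
        seq.append(nxt)


def backward(s):
    return chain(s, prior)


def forward(s):
    return chain(s, posterior)


def on(token, seq):
    """True if `token` is part of some stage of the sequence."""
    return any(token in stage for stage in seq)


def common_cause(s1, s2, s3):
    return on(s1, backward(s2)) and on(s1, backward(s3))


def athwart(s1, s2, s3):
    """s1 stands athwart the trajectory from s2 to s3."""
    return on(s1, forward(s2)) and on(s1, backward(s3))


def is_prior(s3, s2):
    return any(s3 in stage for stage in backward(s2)[1:])


def screened_off(s1, s2, s3):
    """s1 is screened off from s2 by s3 (Definition B.5, on the toy graph)."""
    return is_prior(s3, s2) and all(
        athwart(s3, x, s2) for x in EDGES if common_cause(x, s1, s2))
```

In this graph c is the only common cause of a and b, and m lies on the path from c to b, so `screened_off("a", "b", "m")` holds. The token z is also prior to b but does not lie on any path from c, so `screened_off("a", "b", "z")` fails.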


B.4 Conditions on Hyperfinite
Probability Functions 


The system-of-spheres semantics corresponds to an interpretation of the $\mathrel{\Box\!\!\to}$
conditional in terms of extreme probabilities:


$A \subseteq B \Leftrightarrow \Pr(A/B) \approx 1$


When spheres A and B belong to a Lewis system S, and $A \subseteq B$, this can
be interpreted as representing the situation in which the probability of A, given
B, is infinitely close to 1. Tim Fernando (1998) has recently demonstrated
that any model with a finitely additive hyperreal-valued probability function is
elementarily equivalent to such a system-of-spheres model.

There are four additional conditions that must be imposed upon the hyper- 
real probability function: 


1. Miller’s principle, a principle of higher-order probability enunciated by 
Brian Skyrms. 


2. Markovian locality. 
3. Reichenbach’s rule. 


4. Occam’s razor.


B.4.1 Miller’s Principle 


Miller’s principle requires that the first-order probability weights can be
recovered from higher-order probabilities through integration. In fact, I believe that
facts about modality, including facts about normality and objective chance, are
necessary. In this case, we should adopt the extension of S5 to conditional logic,
holding both of the following axioms:


$$(\varphi \Box\!\!\to \psi) \to \Box(\varphi \Box\!\!\to \psi)$$
$$(\varphi \Diamond\!\!\to \psi) \to \Box(\varphi \Diamond\!\!\to \psi)$$


These S5-like axioms entail Miller’s principle. Hence, if we wish to be very
cautious, we can adopt Miller’s principle as a minimal constraint on the relation
between first-order and higher-order modalities.

Miller’s principle can be stated as follows: 


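Hypothesis B.1 (Miller’s Principle) Let $[W]_\sim$ be the partition of $W$ by
probabilistic agreement, i.e., $w \sim w'$ iff $\forall w''\ \mu_w(w'') = \mu_{w'}(w'')$.
If $A \subseteq [W]_\sim$, then:


$$\mu_w\bigl(C / B \cap \bigcup A\bigr) = \int_{w' \in \bigcup A} \mu_{w'}(C/B)\, d\mu_w(w')$$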
In the case of finite models, the consequent of Miller’s principle can be
represented as:


$$\mu_w\bigl(C / B \cap \bigcup A\bigr) = \sum_{w' \in \bigcup A} \mu_{w'}(C/B)\, \mu_w(w')$$


The following axioms, which I call “Skyrms’s axioms,” are the proof-theoretic
analogue to Miller’s principle. They are restricted forms of what is often called
absorption.


(S1) $(\varphi \Box\!\!\to (\psi \Box\!\!\to \chi)) \leftrightarrow ((\varphi \& \psi) \Box\!\!\to \chi)$


(S2)[(¢O (f O- x)) KO(G&Y)] — ((G&Y) O- x) 


where K is in each case a Boolean combination of $\top$ and $\Box\!\!\to$-formulas.
The operator $\Diamond\!\!\to$ can be defined in terms of $\Box\!\!\to$ in the usual way:


$$(\varphi \Diamond\!\!\to \psi) =_{df} \neg(\varphi \Box\!\!\to \neg\psi)$$


Miller’s principle has a number of other interesting implications for
conditional logic. The following theorems of conditional logic exploit some of this
power:


Theorem B.1 (Implications of Miller’s Principle) Miller’s principle
ensures the validity of the following two axioms:


(a) $\vdash (\top \Box\!\!\to (\varphi \Box\!\!\to \psi)) \leftrightarrow (\varphi \Box\!\!\to \psi)$

(b) $\vdash ((\varphi \& (\varphi \Box\!\!\to \psi)) \Box\!\!\to \psi)$

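Proof:

(a) By axiom S1, we have that $(\top \Box\!\!\to (\varphi \Box\!\!\to \psi))$ is logically equivalent to
$((\top \& \varphi) \Box\!\!\to \psi)$. Since $\varphi$ is logically equivalent (in classical logic) to $\top \& \varphi$, we
have the desired result.

(b) By idempotence, we have:


$$\vdash ((\varphi \Box\!\!\to \psi) \Box\!\!\to (\varphi \Box\!\!\to \psi))$$


By axiom S1, we derive the theorem. QED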
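
The appeal to S1 in part (b) can be spelled out as a substitution instance. The following is a sketch under two assumptions that the proof leaves implicit: that S1 has the absorption form $(\alpha \Box\!\!\to (\beta \Box\!\!\to \gamma)) \leftrightarrow ((\alpha \& \beta) \Box\!\!\to \gamma)$ used in part (a), and that logically equivalent antecedents may be substituted for one another.

```latex
\begin{align*}
&\vdash (\varphi \Box\!\!\to \psi) \Box\!\!\to (\varphi \Box\!\!\to \psi)
  && \text{idempotence} \\
&\vdash ((\varphi \Box\!\!\to \psi) \;\&\; \varphi) \Box\!\!\to \psi
  && \text{S1 with } \alpha := (\varphi \Box\!\!\to \psi),\ \beta := \varphi,\ \gamma := \psi \\
&\vdash (\varphi \;\&\; (\varphi \Box\!\!\to \psi)) \Box\!\!\to \psi
  && \text{commuting the conjuncts of the antecedent}
\end{align*}
```

Note that the outer antecedent $\alpha$ is itself a $\Box\!\!\to$-formula here, and is $\top$ in part (a), so both uses stay within a restricted form of absorption.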

The second theorem is very important, since it suggests that we can fruitfully
define a kind of nonmonotonic consequence in terms of the logical properties of
the $\Box\!\!\to$ conditional.


Definition B.6 (Nonmonotonic Consequence)


$$\varphi \mathrel{|\!\sim} \psi \iff\ \vdash (\varphi \Box\!\!\to \psi)$$
Given the probabilistic interpretation of $\Box\!\!\to$, this definition stipulates that a
conclusion is nonmonotonically derivable from a set of premises just in case the 
probability of the conclusion, conditional on the conjunction of the premises, is 
infinitely close to one. We have, thus, a very well-motivated norm of nonmono- 
tonic reasoning: defeasible inference is just a special case of Bayesian condition- 
ing, for which we have Dutch-book arguments as justification. 

Furthermore, the consequence relation so defined is cumulative (in Gabbay’s 
sense) and preferential (in the sense of Kraus, Lehmann, and Magidor). It obeys 
such principles as Cut, OR, and Cautious Monotonicity. 

The second theorem of the Skyrms system gives us a defeasible form of 
modus ponens: 


$$\varphi \& (\varphi \Box\!\!\to \psi) \mathrel{|\!\sim} \psi$$


In order to derive a particular conclusion nonmonotonically from a set of
premises, it is sufficient to demonstrate that the premises are logically
equivalent to the left side of this defeasible-MP schema. What is needed to effect this
transformation are equivalence-preserving principles governing the
strengthening of the antecedent of $\Box\!\!\to$ conditionals. For example, suppose we have the
premises $\varphi$, $\psi$, $(\varphi \Box\!\!\to \chi)$. We can nonmonotonically derive the conclusion $\chi$ by
defeasible MP just in case we can prove that $(\varphi \Box\!\!\to \chi)$ is logically equivalent to
$((\varphi \& \psi) \Box\!\!\to \chi)$. This can be done if, for example, $\varphi$ logically entails $\psi$. It can
also be done if the premises logically entail $\varphi \Box\!\!\to \psi$. However, to get interesting
nonmonotonic consequences, we need much stronger rules for finding logically
equivalent sets of propositions in which the antecedent of some conditional has
been strengthened in the second set. The principles of Markovian locality and
Occam’s razor will give us such rules.
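
The behaviour of this consequence relation can be illustrated in a single ranked model. The sketch below is a simplification under stated assumptions: it fixes one hypothetical ranking of worlds (the bird and penguin example and the `rank` function are not from the text) instead of quantifying over all models, and it reads a conditional as true when every lowest-rank antecedent-world satisfies the consequent.

```python
from itertools import product

ATOMS = ("bird", "penguin", "flies")
WORLDS = [dict(zip(ATOMS, values))
          for values in product((False, True), repeat=len(ATOMS))]


def rank(world):
    """Hypothetical abnormality rank: 0 is fully normal, higher is more surprising."""
    r = 0
    if world["penguin"]:
        r += 1                      # penguins are unusual
        if world["flies"]:
            r += 1                  # flying penguins even more so
    elif world["bird"] and not world["flies"]:
        r += 1                      # flightless non-penguin birds are unusual
    return r


def normally(antecedent, consequent):
    """True iff every lowest-rank antecedent-world satisfies the consequent."""
    candidates = [w for w in WORLDS if antecedent(w)]
    if not candidates:
        return True                 # vacuous case: impossible antecedent
    best = min(rank(w) for w in candidates)
    return all(consequent(w) for w in candidates if rank(w) == best)


def follows(premises, conclusion):
    """Nonmonotonic consequence in this one model: conjoin the premises as antecedent."""
    return normally(lambda w: all(p(w) for p in premises), conclusion)


def bird(w):
    return w["bird"]


def penguin(w):
    return w["penguin"]


def flies(w):
    return w["flies"]


def negate(p):
    return lambda w: not p(w)


print(follows([bird], flies))                   # birds normally fly
print(follows([bird, penguin], flies))          # strengthening the antecedent can fail
print(follows([bird, negate(penguin)], flies))  # an instance of Cautious Monotonicity
```

Strengthening the antecedent from "bird" to "bird and not penguin" preserves the conclusion because the added conjunct is itself a nonmonotonic consequence of the original antecedent. Strengthening to "bird and penguin" does not, which is why equivalence-preserving strengthening rules are needed.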


B.4.2 Markovian Locality 


In the chapter on the indeterministic model of causation, I already introduced
the principle of probabilistic locality. This principle requires that the probability
of the actuality of a given token is independent of the actuality of any
non-posterior token, given the actuality of its immediate cause. (In this definition,
I make use of the mereological sum operator, $\bigsqcup\varphi$, the mereological sum of all
the tokens supporting type $\varphi$.) I will also make use of the symbol $\nsucc$, causal
non-posteriority, defined as follows:


Definition B.7 (Causal Non-Posteriority)
$(s \nsucc s') =_{df} \neg\exists s''\,(s'' \sqsubseteq s \;\&\; s' \prec s'')$


Hypothesis B.2 (Probabilistic Markovian Locality) If $s_1 = \bigsqcup(x \prec_0 s_2)$,
$(s_3 \nsucc s_1)$, $(s_4 \nsucc s_1)$, and $s_1$, $s_3$, and $s_4$ are compossible, then, for any world $w$,
$\Pr_w(As_2/As_1 \& As_3 \& As_4) = \Pr_w(As_2/As_1 \& As_4)$.


Hypothesis B.3 (Reichenbach’s Rule) If $s_1$ and $s_2$ have no common cause,
and neither is prior (even in part) to the other, and $s_3$ is not posterior to both
$s_1$ and $s_2$, then for any world $w$, $\Pr_w(As_1 \& As_2/As_3) = \Pr_w(As_1/As_3) \times
\Pr_w(As_2/As_3)$.


Theorem B.2 (From Causal to Probabilistic Screening Off)
If $\sigma(s_1, s_2, s_3)$ and $(s_4 \nsucc s_1)$ then $\Pr_w(As_2/As_1 \& As_3 \& As_4) =
\Pr_w(As_2/As_3 \& As_4)$.
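
A small numerical illustration of probabilistic screening off may help here. The sketch below uses a hypothetical four-variable network, Z -> S1 and Z -> S3 -> S2, with made-up probabilities. It is an ordinary Bayesian-network computation, not the hyperreal model of the text: S3 lies on the only path from the common cause Z to S2, so conditioning on S3 should make S1 irrelevant to S2.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy network (not from the text): Z -> S1, Z -> S3 -> S2.
P_Z = F(3, 10)                              # Pr(Z = 1)
P_S1_GIVEN_Z = {1: F(9, 10), 0: F(1, 5)}    # Pr(S1 = 1 | Z = z)
P_S3_GIVEN_Z = {1: F(4, 5), 0: F(1, 10)}    # Pr(S3 = 1 | Z = z)
P_S2_GIVEN_S3 = {1: F(3, 4), 0: F(1, 20)}   # Pr(S2 = 1 | S3 = s3)

NAMES = ("z", "s1", "s3", "s2")


def bern(p_true, value):
    """Probability that a binary variable takes `value`, given Pr(value = 1)."""
    return p_true if value == 1 else 1 - p_true


def joint():
    """Exact joint distribution over (z, s1, s3, s2)."""
    table = {}
    for z, s1, s3, s2 in product((0, 1), repeat=4):
        table[(z, s1, s3, s2)] = (bern(P_Z, z)
                                  * bern(P_S1_GIVEN_Z[z], s1)
                                  * bern(P_S3_GIVEN_Z[z], s3)
                                  * bern(P_S2_GIVEN_S3[s3], s2))
    return table


def prob(table, **fixed):
    """Marginal probability of the assignment given by keyword arguments."""
    return sum(p for world, p in table.items()
               if all(world[NAMES.index(k)] == v for k, v in fixed.items()))


def cond(table, target, given):
    """Conditional probability Pr(target / given); the two dicts must not share keys."""
    return prob(table, **target, **given) / prob(table, **given)


T = joint()
print(cond(T, {"s2": 1}, {"s1": 1, "s3": 1}), cond(T, {"s2": 1}, {"s3": 1}))
print(cond(T, {"s2": 1}, {"s1": 1}), prob(T, s2=1))
```

In this toy network S1 and S2 are correlated through Z, yet the correlation disappears once S3 is held fixed.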


Theorem B.3 (Soundness of Modular Inferences) The following are
logically equivalent, given probabilistic (Markovian) locality and Reichenbach’s rule
(where $\Phi$ is any modally closed formula):


$$([\Phi \& \sigma(d_1, d_2, d_3) \& Ad_3] \Box\!\!\to Ad_2)$$


$$([\Phi \& \sigma(d_1, d_2, d_3) \& Ad_1 \& Ad_3] \Box\!\!\to Ad_2)$$


Proofs of these theorems appear in section B.7. 


B.4.3 Occam’s Razor


Inductive inference is inference to the simplest, most economical explanation. 
This preference for simple hypotheses reflects a logical requirement on prob- 
ability functions, one that incorporates Occam’s razor: a rational probability 
function gives infinitely greater probability to propositions that entail fewer 
tokens, fewer causal connections, or fewer types classifying a given structure. 

The requirement of Occam’s razor can be formalized in three steps. First,
we must define a partial ordering on worlds:


$$w \preceq w' \overset{df}{\iff} \exists f\ \forall s_1, s_2 \sqsubseteq w\ [(s_1 \prec s_2 \to f(s_1) \prec f(s_2)) \;\&\;
\forall\varphi\,(s_1 \models \varphi \to f(s_1) \models \varphi)]$$


A world $w$ is weakly preferred to a world $w'$ just in case there is a
structure-preserving homomorphism $f$ from the parts of $w$ into the parts of $w'$. Strict
preference, $\prec$, is defined in terms of weak preference in the usual way.

Next, we extend this partial ordering to sets of worlds: 


$$A \preceq B \iff \forall w \in B\ \exists w' \in A\ (w' \preceq w)$$


A set A is weakly preferred to set B just in case for every member w of B 
there is a member w’ of A such that w’ is weakly preferred to w. 
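
The lifting of the ordering from worlds to sets of worlds can be sketched directly. In the sketch below the world ordering is a crude stand-in, labeled as such: a world is modeled as a frozenset of hypothetical causal links, and one world counts as weakly preferred to another when its links are a subset of the other's (the identity embedding then plays the role of the structure-preserving homomorphism). Only `set_weakly_preferred` mirrors the definition in the text.

```python
def world_weakly_preferred(w1, w2):
    """Crude stand-in for the world ordering: w1's causal links are a subset of w2's."""
    return w1 <= w2


def set_weakly_preferred(a, b, world_pref=world_weakly_preferred):
    """A is weakly preferred to B iff each w in B has some w' in A weakly preferred to w."""
    return all(any(world_pref(w_prime, w) for w_prime in a) for w in b)


# Hypothetical worlds, each given by its set of (cause, effect) links.
lean = frozenset({("load", "loaded")})
rich = frozenset({("load", "loaded"), ("z", "loaded"), ("z", "alive")})
other = frozenset({("wait", "alive")})

print(set_weakly_preferred({lean}, {rich}))         # lean embeds into rich
print(set_weakly_preferred({rich}, {lean}))         # rich does not embed into lean
print(set_weakly_preferred({lean}, {rich, other}))  # nothing in A embeds into `other`
```

Note that the quantifier order matters: every member of B must be matched by some member of A, so adding an unmatched world to B breaks the preference.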


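Definition B.8 (Occam’s Razor)
A model satisfies Occam’s razor if and only if

$$\forall w \in W_M\ \forall A\ \forall B\ \Bigl(A \prec B \;\&\; B \subseteq \mathrm{dom}(\Pr\nolimits_{M,w}) \to
\sum_{w' \in B} \Pr\nolimits_{M,w}(w') \ll \sum_{w' \in B \cup A} \Pr\nolimits_{M,w}(w')\Bigr).$$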


1. A formula is modally closed just in case every occurrence of an actuality type At occurs
within the context of a modal operator.
Causal minimization means that we can assume that where our premises are
silent about the existence of a causal connection or the supporting of a type by a
token, the causal connection does not exist, and the type is not supported. This 
maximizes the extension of the screening-off relation and thereby maximizes the 
nonmonotonic consequences that are legitimated. 


Theorem B.4 (Soundness of Causal Projection) If every maximally
preferred world in any model that verifies a modally closed formula $\Phi$ also verifies
$\sigma(d_1, d_2, d_3)$ in that same model, then the following inference is
nonmonotonically correct (in the class of models satisfying Occam’s razor, Miller’s principle,
Reichenbach’s rule, and Markov locality):


$$\Phi,\ Ad_1,\ Ad_3,\ (Ad_3 \Box\!\!\to Ad_2) \mathrel{|\!\sim} Ad_2$$


This theorem provides a paradigm for causal projection. First, we use
Occam’s razor to minimize the causal connections between situation-tokens,
maximizing the extension of the screening-off relation. We then apply Markov locality
and Reichenbach’s rule to strengthen the antecedents of the nonmonotonic
conditionals in our premise sets, until the antecedents contain every premise that is
not modally closed. Finally, we use Skyrms’s axioms to derive a nonmonotonic
conditional whose antecedent contains all of the premises and whose consequent
is the desired conclusion. This demonstrates that the inference is
nonmonotonically correct, that is, that the probability of the conclusion, conditional on the
premises, is infinitely close to one.


B.5 Examples 
B.5.1 The Yale Shooting Problem 


Using causal minimization, we can assume that the loaded state of the gun 
just prior to the shooting is screened off from the state of the victim’s being 
alive by the act of loading the gun. This enables us to prove that the premises 
are logically equivalent to a set in which all of the facts are built into the 
antecedent of a $\Box\!\!\to$ conditional, with either the fact $NLoaded$ or $\neg NAlive$ in
the consequent. Hence, we derive the correct nonmonotonic consequence. This 
solution generalizes to any similar problem: we do not have to assume that the 
facts given are atomic, or that we have information only about the initial state. 

In the modified YSP, where we add a state of type Z that is causally prior 
to both the state of type NLoaded and that of type Alive, we can no longer 
assume that no such connection exists. This blocks the application of Markov 
locality, resulting in our failure to show that logical equivalence is preserved 
when the antecedent of the conditional is strengthened. Hence, we no longer 
derive the conclusion that the victim is dead, which seems reasonable in light 
of the possibility of interference by a common cause, Z. 
B.5.2 Pearl’s Sprinkler Problem


Judea Pearl created a very simplified version of the Yale Shooting Problem.
There are two token constants, $d_1$ and $d_2$, with the fixed relation $d_1 \prec_0 d_2$,
i.e., $d_1$ immediately precedes $d_2$. We stipulate that $(d_1 \models \mathit{Sprinkler\text{-}on})$ and
$(d_2 \models \mathit{Sidewalk\text{-}wet})$. We know that $d_1$ is actual, i.e., $Ad_1$.

We also have two defeasible rules:


$$\forall x\,\forall y\,([Ax \& (x \models \mathit{Sprinkler\text{-}on}) \& (y \models \mathit{Sidewalk\text{-}wet}) \& (x \prec_0 y)] \Box\!\!\to Ay)$$


$$\forall x\,([Ax \& (x \models \mathit{Wet})] \Box\!\!\to \exists y\,[(y \prec_0 x) \& (y \models \mathit{Rain})])$$


According to the first rule, states in which the sprinkler is on are normally
followed by states in which the sidewalk is wet. According to the second rule,
states in which the sidewalk is wet are normally preceded by states in which
there is rain. We want to be able to conclude that the sidewalk is wet in $d_2$,
but not that it is raining in $d_1$. Analyzing this example in terms of Markovian
locality yields the correct result, since $d_1$ is not screened off from itself by $d_2$;
hence we are not able to strengthen the antecedent of the second rule to include
the fact that the sprinkler is on in state $d_1$.


B.5.3 The Lifschitz Lamp 


In a case introduced into the literature by Vladimir Lifschitz, we have a lamp
with two switches. If both switches are up in a state, or both are down, then 
the lamp is on at that same moment in time. If one switch is up and the other 
is down, the lamp is off. Suppose the lamp is off, switch one is down, and switch 
two is up. Suppose we perform the action of flipping switch one. There are two 
possible outcomes: the lamp comes on, or switch two goes down. Obviously, we 
prefer the first. 
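
The two candidate outcomes can be recovered by brute-force enumeration. The sketch below is only a proxy for the causal analysis that follows, under a labeled assumption: it applies inertia to the switches first and reads off the lamp afterwards, on the grounds that the lamp is causally dependent on the switches. The variable names are illustrative.

```python
from itertools import product


def consistent(sw1, sw2, lamp_on):
    """Domain constraint from the text: the lamp is on exactly when the switches agree."""
    return lamp_on == (sw1 == sw2)


before = {"sw1": "down", "sw2": "up", "lamp_on": False}
assert consistent(**before)

# Flipping switch one fixes sw1 = "up"; enumerate every consistent completion.
outcomes = [
    {"sw1": "up", "sw2": sw2, "lamp_on": lamp_on}
    for sw2, lamp_on in product(("up", "down"), (False, True))
    if consistent("up", sw2, lamp_on)
]

# Assumed proxy for causal priority: switch two persists by inertia, and the
# lamp, which depends on the switches, is allowed to change.
preferred = [o for o in outcomes if o["sw2"] == before["sw2"]]

for outcome in outcomes:
    print(outcome, "preferred" if outcome in preferred else "rejected")
```

Counting changed facts alone cannot separate the two outcomes, since each changes exactly one thing besides switch one. The asymmetry has to come from the direction of causal dependence.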

There are seven token states to consider, one for each of the elements (two 
switches and a lamp) both before and after the action, plus the action-token 
itself. Common sense gives us the following information about the causal priority 
relations: 

Even though the lamp’s state is simultaneous with the corresponding setting 
of the switches, there is a relation of causal priority between them: the lamp’s 
state is causally dependent on the switches, not vice versa. Consequently, we 
can state the relevant defeasible rules as follows: 


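1. $\forall x\,\forall y\,([(x \models \mathit{Up}) \& (y \models \mathit{Up}) \& (x \prec_0 y) \& Ax] \Box\!\!\to Ay)$
2. $\forall x\,\forall y\,([(x \models \mathit{Down}) \& (y \models \mathit{Down}) \& (x \prec_0 y) \& Ax] \Box\!\!\to Ay)$
3. $\forall x\,\forall y\,([(x \models \mathit{Off}) \& (y \models \mathit{Off}) \& (x \prec_0 y) \& Ax] \Box\!\!\to Ay)$
4. $\forall x\,\forall y\,\forall z\,\forall w\,([(x \models \mathit{Up}) \& (y \models \mathit{Down}) \& (z \models \mathit{On}) \& (w \models \mathit{Off}) \&
((x \sqcup y \sqcup z) \prec_0 w) \& Ax \& Ay \& Az] \Box\!\!\to Aw)$
5. $\forall x\,\forall y\,\forall z\,\forall w\,([(x \models \mathit{Up}) \& (y \models \mathit{Up}) \& (z \models \mathit{Off}) \& (w \models \mathit{On}) \&
((x \sqcup y \sqcup z) \prec_0 w) \& Ax \& Ay \& Az] \Box\!\!\to Aw)$


[Figure B.3: The Lifschitz Lamp]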


The first three are instances of the law of inertia, and the final two express
some of the dependencies of the lamp on the switch settings.

In the case at hand, state $s_3$ is screened off from $s_5$ by $s_2$. Consequently,
we can strengthen the antecedent of rule 1 to derive the conclusion $As_5$. States
$s_1$, $s_2$, $s_3$, and $s_7$ are all screened off from $s_4$ by the sum of $s_4$ and $s_3$.
Consequently, we can strengthen the antecedent of rule 5 and derive the desired
conclusion, $As_6$.


B.5.4 The Emperor’s Colored Blocks 


In the example of the emperor’s colored blocks, we are challenged to represent 
the distinction between qualification constraints (constraints on what can be
done) and ramification constraints (constraints on what must follow an action
of a given kind). We have two yellow blocks. Painting a yellow block red always 
changes its color to red. The emperor has decreed that either both blocks must 
be yellow, or neither. The correct conclusion to draw is that it is possible to 
paint both blocks, but not to paint only one. We must avoid the conclusion 
that painting one block causes the other block to change color as well. 

In this case, we have five tokens: two for each block (before and after the 
painting), and one token for the possible painting action. Let us suppose that 
the would-be painting action can affect only the first block, if it affects any block 
at all. The causal structure can be represented as follows: 

[Figure B.4: The Emperor’s Blocks]

We have five corresponding token constants, $s_1$, $s_2$, $s_3$, $s_4$, and $s_5$, with the
stipulations that $(s_1 : Y)$, $(s_2 : Y)$, $(s_1 \prec s_3)$, $(s_2 \prec s_4)$, $(s_3 : \mathit{Paint})$,
$(s_3 \prec s_4)$, $\neg(s_1 = s_2)$, $(s_4 \neq s_5)$. We have the facts $As_1$, $As_2$, and $As_3$. We have as
defeasible rules the law of inertia for the color of the block. We have a hard rule
of the following form:


$$\neg\Diamond\exists x\,\exists y\,\exists z\,\exists w\,((x \prec_0 z) \& (y \prec_0 z) \& (z \neq w) \& (z \models \neg Y) \& (w \models Y))$$


In addition, we can either add the fact $As_3$ and a defeasible rule to the effect
that painting changes the color, or we could omit the fact $As_3$ and add
instead a hard rule to the effect that painting always changes the color of the
block. In the one case, we will draw the conclusion that the attempted painting
failed to produce its normal result, and in the second case we will conclude that
$s_3$ is not actual.

The crucial fact displayed in the diagram is that $s_1$ and $s_3$ are screened off
from $s_5$ by $s_2$. Hence, inertia will not be overridden in the case of the second
block, and we avoid the counterintuitive result.


B.6 Abduction and Induction 


Abduction to unknown causes would appear to add two additional elements 
to the causal calculus sketched so far. First, we must add some version of the 
principle of the universality of causation. Second, we must add some assumption 
about the completeness of our theory of the actual causal laws. Given these 
assumptions, we can deduce, given any set of facts, the disjunction of the most 
economical possible causes of those facts, where economy is measured in terms 
of the set of situation-tokens, the set of causal connections, and the set of types 
supported by each token. 

In the case of induction, we must obviously give up the assumption that our 
current theory of causal laws is complete. The most difficult problem to solve 
is this: when do we prefer the addition of a new causal law to the postulation
of causal factors adequate to explaining the phenomena by means of already- 
known laws? In addition, we must have some sort of measure of the simplicity of 
possible laws. This should enable us to avoid Goodman’s grue/bleen problem by 
exploiting the facts that situation-types are first-class entities in our ontology, 
and that strong-Kleene and Dunn evaluations enable us to make extremely fine- 
grained distinctions between facts. 


B.7 Proofs of Theorems B.2 through B.4 


Theorem B.2 (From Causal to Probabilistic Screening Off)


If $\sigma(s_1, s_2, s_3)$ and $(s_4 \nsucc s_1)$ then $\Pr(As_2/As_1 \& As_3 \& As_4)
= \Pr(As_2/As_3 \& As_4)$.


Proof: First, I need the following lemma from standard probability theory.


Lemma B.1 (Symmetry of Probabilistic Screening Off)
$\Pr(X/YZ) = \Pr(X/Y)$ iff $\Pr(Z/XY) = \Pr(Z/Y)$.


Proof of Lemma: Assume $\Pr(X/YZ) = \Pr(X/Y)$. Then:

$$\frac{\Pr(XYZ)}{\Pr(YZ)} = \frac{\Pr(XY)}{\Pr(Y)}$$

$$\frac{\Pr(XYZ)}{\Pr(XY)} = \frac{\Pr(YZ)}{\Pr(Y)}$$

$$\Pr(Z/XY) = \Pr(Z/Y)$$

The converse follows by reversing these steps.

End of Proof of Lemma
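
The lemma can also be checked on concrete numbers. The following sketch builds two hypothetical joint distributions over three binary variables with exact rational arithmetic: one in which Z depends only on Y, and one in which Z also depends on X. The lemma predicts that the two screening-off equalities hold together in the first and fail together in the second. The events tested are X = 1, Y = 1, Z = 1.

```python
from fractions import Fraction as F
from itertools import product


def make_joint(p_y, p_x_given_y, p_z_given_xy):
    """Joint over binary (x, y, z) from Pr(y=1), Pr(x=1 | y), and Pr(z=1 | x, y)."""
    joint = {}
    for x, y, z in product((0, 1), repeat=3):
        py = p_y if y else 1 - p_y
        px = p_x_given_y[y] if x else 1 - p_x_given_y[y]
        pz = p_z_given_xy[(x, y)] if z else 1 - p_z_given_xy[(x, y)]
        joint[(x, y, z)] = py * px * pz
    return joint


def pr(joint, x=None, y=None, z=None):
    """Marginal probability; arguments left as None are summed out."""
    return sum(p for (wx, wy, wz), p in joint.items()
               if x in (None, wx) and y in (None, wy) and z in (None, wz))


def x_screened_from_z(joint):
    """Pr(X/YZ) = Pr(X/Y)."""
    return (pr(joint, x=1, y=1, z=1) / pr(joint, y=1, z=1)
            == pr(joint, x=1, y=1) / pr(joint, y=1))


def z_screened_from_x(joint):
    """Pr(Z/XY) = Pr(Z/Y)."""
    return (pr(joint, x=1, y=1, z=1) / pr(joint, x=1, y=1)
            == pr(joint, y=1, z=1) / pr(joint, y=1))


P_X_GIVEN_Y = {0: F(1, 3), 1: F(1, 4)}
SCREENED = make_joint(F(1, 2), P_X_GIVEN_Y,
                      {(0, 0): F(1, 2), (1, 0): F(1, 2), (0, 1): F(3, 5), (1, 1): F(3, 5)})
UNSCREENED = make_joint(F(1, 2), P_X_GIVEN_Y,
                        {(0, 0): F(1, 2), (1, 0): F(1, 2), (0, 1): F(1, 5), (1, 1): F(4, 5)})

print(x_screened_from_z(SCREENED), z_screened_from_x(SCREENED))
print(x_screened_from_z(UNSCREENED), z_screened_from_x(UNSCREENED))
```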


Proof of Theorem B.2: To make the proof somewhat easier to read, let me
introduce the following scheme of abbreviation:


• $As_1 = A$
• $As_2 = B$
• $As_3 = C$
• $As_4 = D$


The proof will proceed by induction on the maximal length of a causal chain
from $s_3$ to $s_2$ (from $C$ to $B$).

Base Case: $s_3$ ($C$) is immediately prior to $s_2$ ($B$). Let $s_5$ be the token such
that $s_3 \sqcup s_5 = P_w(s_2)$, and abbreviate $As_5$ as $E$. By probabilistic locality, we
know that $\Pr(B/CDE) = \Pr(B/ACDE)$. This entails that:


$$\frac{\Pr(BCDE)}{\Pr(CDE)} = \frac{\Pr(ABCDE)}{\Pr(ACDE)} \qquad \text{(B.1)}$$

We need to show that $\Pr(B/CD) = \Pr(B/ACD)$.
Since $A$ is causally screened off from $B$ by $C$, it must be the case that
$A$ and $E$ have no common cause. Consequently, we can apply Reichenbach’s
rule and derive that $A$ and $E$ are probabilistically independent, conditional on
$BCD$. This entails that $\Pr(ABCDE) = \frac{\Pr(ABCD) \cdot \Pr(BCDE)}{\Pr(BCD)}$. Substituting the
right-hand side of the equality in equation (B.1) above yields:

$$\frac{\Pr(BCDE)}{\Pr(CDE)} = \frac{\Pr(ABCD) \cdot \Pr(BCDE)}{\Pr(ACDE) \cdot \Pr(BCD)} \qquad \text{(B.2)}$$

Canceling out $\Pr(BCDE)$ on both sides gives us:

$$\frac{1}{\Pr(CDE)} = \frac{\Pr(ABCD)}{\Pr(ACDE) \cdot \Pr(BCD)} \qquad \text{(B.3)}$$

A little more algebraic manipulation gives us:

$$\frac{\Pr(BCD)}{\Pr(ABCD)} = \frac{\Pr(CDE)}{\Pr(ACDE)} \qquad \text{(B.4)}$$

Equation (B.4) entails that $\Pr(A/BCD) = \Pr(A/CDE)$.

Another application of Reichenbach’s rule gives us that $A$
and $E$ are probabilistically independent, conditional on $CD$. This entails that
$\Pr(A/CD) = \Pr(A/CDE)$. By substitution, we have that $\Pr(A/BCD) =
\Pr(A/CD)$. By the lemma of symmetry of independence, we have that
$\Pr(B/ACD) = \Pr(B/CD)$. End of Base Case.
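
The product formula substituted into (B.1) follows from the conditional independence of $A$ and $E$ given $BCD$ by unpacking the definition of conditional probability. A sketch of that step, assuming only that $\Pr(BCD) > 0$:

```latex
\begin{align*}
\Pr(AE/BCD) &= \Pr(A/BCD) \cdot \Pr(E/BCD) \\
\frac{\Pr(ABCDE)}{\Pr(BCD)} &= \frac{\Pr(ABCD)}{\Pr(BCD)} \cdot \frac{\Pr(BCDE)}{\Pr(BCD)} \\
\Pr(ABCDE) &= \frac{\Pr(ABCD) \cdot \Pr(BCDE)}{\Pr(BCD)}
\end{align*}
```

The passage from (B.4) to $\Pr(A/BCD) = \Pr(A/CDE)$ is the same unpacking read in reverse: inverting both sides of (B.4) gives $\Pr(ABCD)/\Pr(BCD) = \Pr(ACDE)/\Pr(CDE)$.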


Inductive Case: Let us assume that the maximal causal chain from $C$ to $B$
is of length $n$. For our inductive hypothesis, we assume that the theorem holds
whenever the length of the causal chain is less than $n$. Let $F$ represent the
actuality of the sum of the immediate causal antecedents of $B$. The maximal length
of a causal chain from $C$ to $F$ is less than $n$, so the inductive hypothesis applies.
Since $C$ screens off $A$ from $B$, it must also screen off $A$ from $F$. Consequently,
by the inductive hypothesis, we have that $\Pr(F/ACD) = \Pr(F/CD)$. By the
symmetry lemma, this entails $\Pr(A/CDF) = \Pr(A/CD)$.

By an application of probabilistic locality, we have $\Pr(B/CDF) =
\Pr(B/ACDF)$. By the symmetry lemma, we have $\Pr(A/CDF) =
\Pr(A/BCDF)$. By substitution, we derive:


$$\Pr(A/CD) = \Pr(A/BCDF) \qquad \text{(B.5)}$$

Another application of the inductive hypothesis gives us that $\Pr(F/ABCD)
= \Pr(F/BCD)$, since $B$ is not posterior to $A$. Again, by the symmetry lemma,
this is equivalent to $\Pr(A/BCDF) = \Pr(A/BCD)$. Again, by substitution we
can reach:


$$\Pr(A/CD) = \Pr(A/BCD) \qquad \text{(B.6)}$$


Applying the symmetry lemma again results in:

$$\Pr(B/CD) = \Pr(B/ACD) \qquad \text{(B.7)}$$

QED.


Theorem B.3 (Soundness of Modular Inferences)


The following are logically equivalent, given probabilistic (Markovian)
locality and Reichenbach’s rule (where $\Phi$ is any modally closed
formula):

$$([\Phi \& \sigma(d_1, d_2, d_3) \& Ad_3] \Box\!\!\to Ad_2)$$


$$([\Phi \& \sigma(d_1, d_2, d_3) \& Ad_1 \& Ad_3] \Box\!\!\to Ad_2)$$


Proof: Suppose that some model $M$ and world $w$ make one of the
conditionals above true. In this case, since facts about screening off are non-contingent,
it must be that $\sigma(\|(d_1)\|, \|(d_2)\|, \|(d_3)\|)$ is either impossible or necessary. If
it is impossible, then both antecedents are impossible, and both conditionals
are true, relative to $M, w$. If it is necessary, then every admissible probability
function must have $A(d_1)$ screened off from $A(d_2)$ by $A(d_3)$. The conditional set
selection function $f$ in $M$ must respect these constraints on admissible
probability functions. Hence, if one conditional is true, so must the other be. QED


Theorem B.4 (Soundness of Causal Projection)


If every maximally preferred world in any model that verifies a
modally closed formula $\Phi$ also verifies $\sigma(d_1, d_2, d_3)$ in that same
model, then the following inference is nonmonotonically correct (in
the class of models satisfying Occam’s razor, Miller’s principle,
Reichenbach’s rule, and Markov locality):


$$\Phi,\ Ad_1,\ Ad_3,\ (Ad_3 \Box\!\!\to Ad_2) \mathrel{|\!\sim} Ad_2$$


Proof: The nonmonotonic inference is correct if and only if the corresponding
$\Box\!\!\to$ conditional is logically valid. Therefore, it is sufficient to show that the
conditional is verified by every assignment function in every model. Let $M$ be an
arbitrary model, and $w$ an arbitrary world. There are two cases to consider: (1)
for no world $w'$ accessible to $w$ do $M, w'$ verify the antecedent (the premises of
the argument), and (2) there is at least one world $w'$ accessible to $w$ verifying the
antecedent. In case (1), the truth-conditions of the $\Box\!\!\to$ conditional guarantee
that the conditional is verified by $M, w$. Thus, assume that we have case (2).

Let A be the set which the selection function f selects for the world w and 
the antecedent of the conditional. Let w’ be an arbitrary world in A. 

First, I can prove that $w'$ is a maximally preferred world that verifies the
premises. Suppose for contradiction that it is not. In this case there is an
isomorphic world $w''$ that is strictly preferred to $w'$, by virtue of having a
leaner causal structure, and $w''$ also verifies the premises relative to $M$ for some
assignment function $h$. Consider the set $(A \cup \{w''\}) - \{w'\}$. This set is strictly
preferred to $A$. $M$ must satisfy Occam’s razor. So, it is this set, and not $A$,
that the selection function $f$ must assign to the world $w$ and the antecedent of
the conditional (the conjunction of the premises), contrary to our assumption.

So $w'$ is maximally preferred, among those worlds verifying the premises
in $M$. Therefore, by the hypothesis of the theorem, $w'$ also verifies
$\sigma(\|(d_1)\|, \|(d_2)\|, \|(d_3)\|)$. In this case, by Theorem B.3, it follows that, since $M, w'$ verify
the conditional $(Ad_3 \Box\!\!\to Ad_2)$, they must also verify $((Ad_1 \& Ad_3) \Box\!\!\to Ad_2)$.
Since the other premises are non-contingent, they too can be added to the
antecedent, resulting in:


$$M, w' \models ([\Phi(d_1, d_2, d_3) \& Ad_1 \& Ad_3] \Box\!\!\to Ad_2)$$


Since $w'$ was an arbitrary member of the set assigned by the selection function $f$ to
world $w$ and the set of worlds verifying the antecedent, we have:


$$M, w \models ([\Phi \& Ad_1 \& Ad_3] \Box\!\!\to
([\Phi \& Ad_1 \& Ad_3] \Box\!\!\to Ad_2))$$


Finally, an application of axiom (S1) gives us that $M, w$ verifies a conditional
having the premises as its antecedent and the conclusion as its consequent. QED